Although full-text search functions and search engines have existed for decades, intelligent search is still rarely seen. Users still have to reformulate their queries and find related words to express the concept they have in mind. Neither the IR community nor search engine builders have yet recognized that every search involves a translation task. The language of the information authors needs to be mapped onto the language used by the information seekers, even if both speak the same native language. Furthermore, the terms used at retrieval time may be completely different from the terms used when the information was produced. Semantic technologies offer a solution to this translation problem.
In this introductory talk we show a number of motivating real-world examples in which more intelligent retrieval can bridge the language gap between authors and users. Such intelligent retrieval needs to account for the language use of both sides and can use an ontology to translate their expressions into a controlled vocabulary. Since the terms used differ from person to person and change over time, this language use needs to be acquired and maintained in a continuous process. We argue that companies that have established terminology management or use controlled vocabularies are in a prime position to build up the background knowledge needed for the adoption of semantic search.
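The idea of translating both author and user language into a controlled vocabulary can be sketched in a few lines of Python. This is only a minimal illustration, not the approach presented in the talk: the vocabulary, the concept identifiers and the function names are invented for the example, and the naive substring matching stands in for proper linguistic processing.

```python
# Hypothetical toy vocabulary: concept identifier -> terms used for it.
# All entries are invented for illustration.
VOCABULARY = {
    "concept:notebook": {"notebook", "laptop", "portable computer"},
    "concept:mobile_phone": {"mobile phone", "cell phone", "smartphone"},
}


def build_term_index(vocabulary):
    """Invert the vocabulary: lower-cased term -> set of concept identifiers."""
    index = {}
    for concept, terms in vocabulary.items():
        for term in terms:
            index.setdefault(term.lower(), set()).add(concept)
    return index


def concepts_in(text, term_index):
    """Return the concepts whose terms occur in the text (naive substring match)."""
    lowered = text.lower()
    found = set()
    for term, concepts in term_index.items():
        if term in lowered:
            found |= concepts
    return found


def search(query, documents, term_index):
    """Return documents that share at least one concept with the query."""
    query_concepts = concepts_in(query, term_index)
    return [
        doc for doc in documents
        if query_concepts & concepts_in(doc, term_index)
    ]


TERM_INDEX = build_term_index(VOCABULARY)
DOCUMENTS = [
    "New laptop models announced",
    "Cell phone tariffs compared",
]
RESULTS = search("notebook", DOCUMENTS, TERM_INDEX)
```

Here the query term "notebook" and the author's term "laptop" never match literally, but both map to the same concept, so the first document is found. Keeping such a vocabulary up to date is the continuous acquisition process mentioned above.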
Have you ever noticed that the names of entities are not fixed? Your family name may change with marriage or divorce. Streets, places, cities, communities, counties and even countries are occasionally renamed. Products get renamed, as do departments, companies and organizations. What does not change, however, is the use of the old names in existing information. Names of entities are hence only valid for a certain period of time. Again, an intelligent search needs to account for this situation. We argue that the time information connected with the validity of a name can be used to extend retrieval functionality. Examples include collecting all relevant information about a department or company whose name has changed, identifying a company that appears under different names in multiple address records, and translating historical addresses for the purpose of geocoding.
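One possible way to represent names with a validity period, and to use them for query expansion, is sketched below in Python. It is an assumption-laden illustration rather than the method of the talk: the `NamePeriod` structure, the helper functions and the company names and dates are all invented for the example.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class NamePeriod:
    """A name of an entity together with the period in which it was valid."""
    entity_id: str
    name: str
    valid_from: date
    valid_to: Optional[date]  # None means the name is still valid


# Hypothetical name history of a fictional company (invented data).
NAME_HISTORY = [
    NamePeriod("company:42", "Acme Data GmbH", date(2001, 1, 1), date(2009, 12, 31)),
    NamePeriod("company:42", "Acme Analytics AG", date(2010, 1, 1), None),
]


def is_valid_on(period, day):
    """Check whether the name was in use on the given day."""
    return period.valid_from <= day and (
        period.valid_to is None or day <= period.valid_to
    )


def resolve(name, day, history):
    """Find the entities that carried this name on the given day."""
    return {
        p.entity_id for p in history
        if p.name == name and is_valid_on(p, day)
    }


def all_names(entity_id, history):
    """List every name of an entity, oldest first."""
    periods = sorted(history, key=lambda p: p.valid_from)
    return [p.name for p in periods if p.entity_id == entity_id]


def expand_query(name, day, history):
    """Expand a name into all names of the entity it denoted on that day."""
    names = set()
    for entity_id in resolve(name, day, history):
        names.update(all_names(entity_id, history))
    return names or {name}  # fall back to the original name if unknown
```

Calling `expand_query("Acme Data GmbH", date(2005, 1, 1), NAME_HISTORY)` yields both the old and the new name, so a search can collect information published under either. The same lookup in the other direction, from a historical name to the currently valid one, is what a geocoder would need for old addresses.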
Datenlabor Berlin develops customer-specific data products and algorithms, from initial conception through to prototype solutions, especially but not exclusively for SMEs.
It applies appropriate methods and concepts ranging from data gathering and integration, through data cleansing, text and data mining, data and (social) network analysis, and knowledge engineering and modeling, to the validation and evaluation of the designed algorithms.
Although still young, it has already worked for a number of well-known customers, including T-Systems, Ontoprise, Vivaki, Ontonym, Europublic and locadeo.