ScienceDirect – Search log analysis: What it is, what’s been done, how to do it

The use of data stored in transaction logs of Web search engines, Intranets, and Web sites can provide valuable insight into understanding the information-searching process of online searchers. This understanding can enlighten information system design, interface development, and devising the information architecture for content collections. This article presents a review and foundation for conducting Web search transaction log analysis. A methodology is outlined consisting of three stages, which are collection, preparation, and analysis. The three stages of the methodology are presented in detail with discussions of goals, metrics, and processes at each stage. Critical terms in transaction log analysis for Web searching are defined. The strengths and limitations of transaction log analysis as a research method are presented. An application to log client-side interactions that supplements transaction logs is reported on, and the application is made available for use by the research community. Suggestions are provided on ways to leverage the strengths of, while addressing the limitations of, transaction log analysis for Web-searching research. Finally, a complete flat text transaction log from a commercial search engine is available as supplementary material with this manuscript.

via ScienceDirect – Library & Information Science Research : Search log analysis: What it is, what’s been done, how to do it.
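To make the three stages from the abstract (collection, preparation, analysis) a little more concrete, here is a minimal sketch in Python. It is not the paper's code. The tab-separated log layout, the 30-minute session cutoff, the sample queries, and every function name are assumptions made up for illustration.

```python
from collections import Counter
from datetime import datetime, timedelta

# Hypothetical log: user_id <TAB> timestamp <TAB> query (all values invented).
RAW_LOG = """\
u1\t2006-05-01 10:00:00\tsearch log analysis
u1\t2006-05-01 10:02:00\tSearch Log Analysis methods
u2\t2006-05-01 10:05:00\t
u1\t2006-05-01 11:30:00\ttransaction logs
u2\t2006-05-01 10:06:00\tweb searching
"""

# Assumed inactivity gap that starts a new session; not a value from the paper.
SESSION_GAP = timedelta(minutes=30)


def collect(raw):
    """Collection: parse raw log lines into (user, time, query) records."""
    records = []
    for line in raw.splitlines():
        user, stamp, query = line.split("\t")
        when = datetime.strptime(stamp, "%Y-%m-%d %H:%M:%S")
        records.append((user, when, query))
    return records


def prepare(records):
    """Preparation: drop empty queries, normalize case, sort by user and time."""
    cleaned = [(u, t, q.strip().lower()) for u, t, q in records if q.strip()]
    return sorted(cleaned, key=lambda r: (r[0], r[1]))


def analyze(records):
    """Analysis: count queries, sessions, and terms."""
    sessions = 0
    last_seen = {}  # user -> time of that user's previous query
    term_counts = Counter()
    for user, when, query in records:
        previous = last_seen.get(user)
        if previous is None or when - previous > SESSION_GAP:
            sessions += 1
        last_seen[user] = when
        term_counts.update(query.split())
    total_terms = sum(term_counts.values())
    return {
        "queries": len(records),
        "sessions": sessions,
        "mean_terms_per_query": total_terms / len(records) if records else 0.0,
        "term_counts": term_counts,
    }


if __name__ == "__main__":
    stats = analyze(prepare(collect(RAW_LOG)))
    print(stats["queries"], stats["sessions"], stats["mean_terms_per_query"])
    print(stats["term_counts"].most_common(3))
```

A real transaction log would need more careful preparation (robot traffic, duplicate submissions, malformed lines), and the choice of session boundary is a judgment call rather than a fixed rule.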


Challenges to finding relevant scientific literature

Simply searching Google or any other general-purpose search engine does not cut it, especially for scientific literature. We have to help users get to the information they are really looking for. It's challenging: how do you provide tools without becoming too noisy and distracting?

The paper referenced below does not answer all these questions, but it does review the kinds of tools available to researchers in biology today and what needs to be done to improve them.

In the words of the authors:

This review shows the promise of literature data mining and the need for challenge evaluations. It shows how current language processing approaches can be successfully used to extract and organize information from the literature. It also illustrates the diversity of applications and evaluation metrics. By defining several biologically important challenge problems and by providing the associated infrastructure, we can accelerate progress in this field. This will allow us to compare approaches, to scale up the technology to tackle important problems, and to learn what works and what areas still need work.

This comment perhaps sums up the challenge information providers face, and the compromises that tend to be made:

…, it is unclear how to compare the different approaches; it is also unclear how well a system has to perform to be useful. To compare technical approaches, different systems must be applied to the same domain via common evaluations. To know how good a system has to be, prototypes must be given to biologists in user-centered evaluations. As learned from previous evaluations in the information retrieval community (Hersh et al., 2001), it is hard to extrapolate from results of batch experiments to predict complex issues of utility and user acceptance of interactive tools. However, even imperfect tools are useful, if they give improved functionality at low cost.

Accomplishments and challenges in literature data mining for biology. Hirschman et al., Bioinformatics 18(12): 1553.
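The "common evaluations" idea in the quote above can be pictured as scoring different systems against one shared gold standard. The sketch below uses precision, recall, and F1, which are standard batch metrics but not necessarily the ones any particular challenge evaluation used. The gold set, the two systems, and their outputs are invented toy data.

```python
def precision_recall_f1(predicted, gold):
    """Score one system's extracted items against a shared gold standard."""
    predicted, gold = set(predicted), set(gold)
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Toy data, invented for illustration: gene mentions annotated in one abstract.
GOLD = {"BRCA1", "TP53", "EGFR", "MYC"}

# Hypothetical outputs of two extraction systems run on the same abstract.
SYSTEMS = {
    "system_a": {"BRCA1", "TP53", "KRAS"},
    "system_b": {"BRCA1", "TP53", "EGFR", "MYC", "KRAS", "PTEN"},
}

if __name__ == "__main__":
    for name, output in SYSTEMS.items():
        p, r, f1 = precision_recall_f1(output, GOLD)
        print(f"{name}: precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Scores like these only cover the batch side of the comparison. As the quote points out, they say little about whether biologists would actually find an interactive tool useful.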