Data is at the heart of new science ecosystem – DATA – Research Information

Open data and open APIs offer huge opportunities for research and innovation, writes Elsevier’s Rafael Sidi

via Data is at the heart of new science ecosystem – DATA – Research Information.


ScienceDirect – Search log analysis: What it is, what’s been done, how to do it

The use of data stored in transaction logs of Web search engines, Intranets, and Web sites can provide valuable insight into understanding the information-searching process of online searchers. This understanding can enlighten information system design, interface development, and devising the information architecture for content collections. This article presents a review and foundation for conducting Web search transaction log analysis. A methodology is outlined consisting of three stages, which are collection, preparation, and analysis. The three stages of the methodology are presented in detail with discussions of goals, metrics, and processes at each stage. Critical terms in transaction log analysis for Web searching are defined. The strengths and limitations of transaction log analysis as a research method are presented. An application to log client-side interactions that supplements transaction logs is reported on, and the application is made available for use by the research community. Suggestions are provided on ways to leverage the strengths of, while addressing the limitations of, transaction log analysis for Web-searching research. Finally, a complete flat text transaction log from a commercial search engine is available as supplementary material with this manuscript.

via ScienceDirect – Library & Information Science Research : Search log analysis: What it is, what’s been done, how to do it.
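
To make the three stages in the abstract a bit more concrete, here is a minimal sketch in Python. It is not the article's own method or tooling. The tab-separated log layout (user id, ISO timestamp, query), the function names and the chosen metrics are all assumptions made purely for illustration.

```python
from collections import Counter
from dataclasses import dataclass
from datetime import datetime
from statistics import mean


@dataclass(frozen=True)
class Record:
    # Hypothetical log fields; real transaction logs vary by search engine.
    user_id: str
    timestamp: datetime
    query: str


def collect(lines):
    """Collection: parse raw tab-separated lines into records."""
    records = []
    for line in lines:
        parts = line.rstrip("\n").split("\t")
        if len(parts) != 3:
            continue  # skip lines that do not match the assumed layout
        user_id, stamp, query = parts
        try:
            when = datetime.fromisoformat(stamp)
        except ValueError:
            continue  # skip lines with an unreadable timestamp
        records.append(Record(user_id, when, query))
    return records


def prepare(records):
    """Preparation: normalize case and whitespace, drop empty queries."""
    cleaned = []
    for r in records:
        query = " ".join(r.query.lower().split())
        if query:
            cleaned.append(Record(r.user_id, r.timestamp, query))
    return sorted(cleaned, key=lambda r: (r.user_id, r.timestamp))


def analyze(records):
    """Analysis: a few simple query-level and term-level metrics."""
    if not records:
        return {"total_queries": 0, "mean_queries_per_user": 0.0,
                "mean_terms_per_query": 0.0, "top_terms": []}
    queries_per_user = Counter(r.user_id for r in records)
    term_counts = Counter(t for r in records for t in r.query.split())
    return {
        "total_queries": len(records),
        "mean_queries_per_user": mean(queries_per_user.values()),
        "mean_terms_per_query": mean(len(r.query.split()) for r in records),
        "top_terms": term_counts.most_common(3),
    }
```

Calling `analyze(prepare(collect(open_log_file)))` on any iterable of lines in the assumed layout returns a small dictionary of metrics. A real study would add session boundaries and the client-side interaction data the abstract mentions.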

Elsevier/ScienceDirect, recent happenings and discussions in blogosphere

I was going to write about recent happenings at Elsevier and ScienceDirect, but Abhishek Tiwari has already blogged about them, so I’ll just point to his commentary. It is very well thought out and a wonderful read.

I will add my two cents, though. I don’t work directly for the team that coined the phrase “Article of the Future”, but I certainly appreciate the effort and good intentions behind it. They were trying (in my opinion) to get feedback from the scientific community on where it wanted Elsevier to go from an article-publishing standpoint. Yes, they could have found a few researchers and asked them for input before opening it up to a wider audience, but I believe that the more open you are to feedback, the better you can respond. I also believe that, through the various comments and blog commentaries out there, Elsevier has been able to gather a wealth of user input that would not have been possible had they approached only a few people. Through these comments, users have told Elsevier what they care about, what is important to them, what is not so important, what is fluff, etc.

Please keep in mind that this is my personal opinion. I am not privy to any long-term strategy for the company as a whole, and Elsevier is not responsible for my comments in any way. If this concept had been titled “Elsevier’s plan for journal article layout redesign”, I wonder how many people would have bothered to comment on it and offer critical feedback.

The other point Abhishek covers that I also wanted to talk about is the NextBio integration with ScienceDirect. A colleague of mine worked very hard to get this feature into ScienceDirect. I firmly believe that this is where publishing is heading. There will be more tools and applications like this that seek to add value to existing online content. In the end, the user gains additional insights and (hopefully) gets to the information they seek faster. I do realize that, as with everything else in life, flaws will be detected in the NextBio/SD integration over time. My hope is that we will continue to learn from them through customer contact, feedback and online commentary.

Here is the related extract from Abhishek’s blog. I’ve linked to the article at the bottom.

Last month Elsevier announced the “Article of the Future” project, a very ambitious project proclaimed as trend setter which will redefine how a scientific article is presented online. In the recent past Elsevier has announced several other initiatives to redesign and enhance their online interface ScienceDirect such as Elsevier Article 2.0 contest, Elsevier Grand Challenge and very recently integration with NextBio’s search technology. ScienceDirect which serves as home for more than 2,000 journals will be implementing best ideas of this project. A prototype was rolled out for ScienceDirect hosted journal Cell using content from two previously published articles. At very first glance, one can understand the major enhancement was a hierarchical tabbed presentation of article content where sections of articles such as figures, references can be browsed from top level itself. Secondly the sections accompanied with more real time navigation mechanism. Although these changes look fascinating, I don’t consider this transformation of traditional articles (where they follow a basic linear flow) to a hierarchical presentation(where reader has freedom to browse through article via individualized entry points) as major breakthrough in scientific publishing for very genuine reasons. It is not clear to me how easy it will be to transform a row manuscript into such a staged article? Even if it is easy to do so it is not very first in the list. There have been several other implementation of this kind before this announcement went in public, for example a similar kind of semantically enhanced creative re-use was worked out for a pre-published PLos Neglected Tropical Diseases article. Some may argue the whole affair is great sales gimmick, and Elsevier is trying desperately to polish their paid content in order to retain their institutional subscription which is struggling hard against undercurrent of open access. 
For example Scholarly kitchen writes Elsevier is just trying to put web 2.0 lipstick on a pig (the traditional print article). According to Scholarly kitchen

Elsevier’s “Article of the Future” looks like an article from the past, with some embedded hyperlinks, some AJAX tabs, two basic social media elements, and not much else.
No doubt “Article of the Future” failed to convince the scientific community on several key areas. It does not mean Elsevier should abandon experimenting with their content, in fact they are experimenting with several other good features such as GenBank linking, ThermoML linking and most notably NextBio integration. Several ScienceDirect articles contain GenBank sequences, these are now linkable to the description of that particular sequence in GenBank. Most of mathematical content on ScienceDirect is now rendered as MathML. I find these kind of enhancement more innovative and appealing from a user perspective, and for me these are real feature that next generation scientific articles should deliver. Integration of NextBio technology with ScienceDirect interface delivers a number of powerful benefits. This integrations has enabled not only accelerated discoverability of key articles but it also provides more insight from existing content through NextBio’s ontology-based semantic framework, extracted correlation with existing knowledge base, and tight integration with publicly available data sources.


There’s one more comment towards the end of Abhishek’s blog that I want to point out:

Annotations is one of the most wanted feature for next generation scientific articles especially shared community annotation and tagging. Last but not least, for god shake stop promoting PDF.

Concepts like annotations have been in the works at many publishers. The problem, however, is that every time we do focused user tests, where I go out, meet users and demo a concept to them, more often than not I get comments along the lines of: “That’s nice. But I only really care about the PDF. If there was some way for me to get to the PDF faster, I’m all for it.” How can you, as a business, not take that into account when you’re doing a cost/benefit analysis? There is always the concern: do people really care? Do I really want to invest in all this work for a feature enhancement when the feedback is telling me that it will hardly be used?

Nevertheless, I know Elsevier and other publishers have moved forward with concepts that will hopefully become useful over time as more and more people become aware of them. I for one am all for putting features and concepts out in the open, letting our users tell us what’s wrong with them, and improving them in subsequent releases. Maybe the 5 to 10% of users who are early adopters will help create awareness among the vast majority who are not. Maybe the company’s brand name will take a beating initially, but eventually it will be known for responding to user feedback, valuing user opinion and partnering with the community.

Michael Nielsen » Is scientific publishing about to be disrupted?

This is a very insightful entry by Michael Nielsen. Due to my bias, I had to skip straight to Part II before coming back to read Part I. Some of the things I might question: automatic spelling correction, relevancy ranking, alerting services, etc. are indeed offered on Scopus. Whether they are good (I believe they are competitive) is certainly something for users to judge, and Michael would qualify as one. I haven’t heard from any of the users I have talked to that any of these features are poor, but again, that could be my bias.

A great search engine for science: ISI’s Web of Knowledge, Elsevier’s Scopus and Google Scholar are remarkable tools, but there’s still huge scope to extend and improve scientific search engines [6]. With a few exceptions, they don’t do even basic things like automatic spelling correction, good relevancy ranking of papers (preferably personalized), automated translation, or decent alerting services. They certainly don’t do more advanced things, like providing social features, or strong automated tools for data mining. Why not have a public API [7] so people can build their own applications to extract value out of the scientific literature? Imagine using techniques from machine learning to automatically identify underappreciated papers, or to identify emerging areas of study.

via Michael Nielsen » Is scientific publishing about to be disrupted?.

Read the article in its entirety. It is very insightful, and as always there are several pointers to take away.

New product strategies for publishing companies

Love this presentation from Judy Sims. It is very nicely summarized at the end of her post, which I’ve quoted below. I’ve taken the liberty of highlighting specific themes below that traditional publishers need to pay attention to:

The economics of media have shifted. Scarcity and abundance have flipped. This has caused hyperdeflation in media value and the end of the blockbuster era. Hyperdeflation can be countered by creating snowballs. The old media blockbuster economy was built on exclusion. The new snowball economy will be built by being open to aggregators, micro-platforms and re-constructors and capitalizing on economies of distribution, coordination and production.

This part I really like:

In media 2.0 there are 3 sources of value creation.

Revelation – what’s good. (My comment: includes anything that helps our users get what they want – links from competitors, blogs; anything relevant to users – we should provide it.)

Aggregation – bring elegant organization to the huge amount of data I’m exposed to and

Plasticity – let me get my hands on your content to see how I can add my own value to it. (My comment: i.e. via API, etc)

This new economy requires radically different product strategies: letting the outside in, curation rather than ownership, becoming a part of an ecosystem, moving from mass to vertical content and viewing the site as a service instead of a product.

via The New Economics of Media and – SimsBlog.

Social Media for Scientists: Video Resources for Life Science Researchers

The social media phenomenon is truly taking off. You can clearly see this today in the coverage of the Iran election protests.

If you are a researcher today, you need to be connected with the rise of this form of publishing via Twitter, FriendFeed, YouTube, the Journal of Visualized Experiments, etc.

Clearly, information is going to be dispersed across many different websites and services, each catering to a niche area of research. Traditional publishers need to recognize this trend, support it and get into the business of curating all this information.

Also check this out (post linked to below)

there are an increasing number of video sites and resources for scientists. They range from visualized experiments, to reviews of current research and events, to wacky and fun ‘kitchen science’

via San Diego Biotechnology Network: Biotech Events, Jobs, News, Companies, Directory, Blog, & Calendar » Blog Archive » Social Media for Scientists: Video Resources for Life Science Researchers.

At Google’s Searchology event, executives give search ‘state of the union’

This is brilliant. I can see this kind of feature being applied to scientific research, for example when searching through topics, or when looking at an article, by visually representing the article against its references and the articles that cite it.

Marissa Mayer and her team are introducing new features to Google’s search results.

Via a tool called “search options,” users can now quickly “slice and dice” their search results in a variety of new ways.

On the results page, you can click “show options,” for example, on a search of “solar ovens.” You can then quickly filter the results to see video entries, entries from discussion forums and even user reviews that have undergone “sentiment analysis” — that is, whether the reviewer liked the product (a solar oven) or not.

Also included on the search options is a feature called “wonder wheel,” where Google will draw a simple topic diagram that connects your search query to similar topics. For “solar oven,” you might be given the option to search “how solar ovens work,” or “homemade solar ovens.”

from –
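
As a rough illustration of how a “wonder wheel” style diagram might carry over to scholarly articles, here is a minimal Python sketch that builds the hub-and-spoke data for one article from its references and the articles citing it. Nothing here comes from Google or ScienceDirect. The function name, the `(citing, cited)` pair format and the `limit` parameter are hypothetical, and drawing the diagram is left out.

```python
def citation_wheel(hub, citations, limit=5):
    """Return labelled spokes connecting one article to its neighbours.

    hub:       id of the article at the centre of the wheel
    citations: iterable of (citing_id, cited_id) pairs (assumed format)
    limit:     maximum number of spokes per direction
    """
    citations = list(citations)
    # Articles the hub cites.
    references = sorted({cited for citing, cited in citations if citing == hub})
    # Articles that cite the hub.
    cited_by = sorted({citing for citing, cited in citations if cited == hub})

    spokes = [(hub, other, "references") for other in references[:limit]]
    spokes += [(other, hub, "cited by") for other in cited_by[:limit]]
    return spokes
```

Each spoke is a `(source, target, label)` tuple, so a front end could draw the article in the middle with its references on one side and its citing articles on the other.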