METHODOLOGY

Unit of analysis: authors who have published in the areas of Bibliometrics, Scientometrics, Informetrics, Webometrics, or Altmetrics, and for whom a Google Scholar Citations (GSC) public profile could be found at the time the data was collected (24/07/2015).

Data sources

  • Google Scholar Citations: a restricted beta release was made on the 20th of July, 2011. It was opened to the general public on the 16th of November, 2011.
  • ResearcherID: author identification system developed by Thomson Reuters. Released in January 2008.
  • ResearchGate: academic social network created in May 2008.
  • Mendeley: social reference manager created in August 2008.
  • Twitter: online social networking service that enables users to send and read short 140-character messages. Released on the 15th of July, 2006.

Search and identification of relevant authors: In order to identify the set of authors relevant to our study (those who have published in Bibliometrics and have a public profile in GSC), several search strategies were used:

  1. Keywords: A search was conducted in the journals Scientometrics, Journal of Informetrics, Research Evaluation, Cybermetrics, and the ISSI conferences (International Conference on Scientometrics and Informetrics) with the goal of extracting the most frequently used and representative words in the discipline. The selected keywords were:
    • Altmetrics
    • Bibliometrics
    • Citation Analysis
    • Citation Count
    • H Index
    • Impact Factor
    • Informetrics
    • Patent Citation
    • Quantitative Studies of Science and Technology
    • Research Assessment
    • Research Evaluation
    • Research Policy
    • Science and Technology Policy
    • Science Evaluation
    • Science Policy
    • Science Studies
    • Scientometrics
    • Webometrics
    All public GSC profiles containing these keywords (GSC allows authors to display up to five keywords) were selected. In addition, the lack of normalization in the use of keywords sometimes forced us to search variants of these keywords. These variants included misspelled words, the same keywords in other languages, etc. As an example, these are all the variants we found of the keyword “bibliometrics”: bibliometric, bibliometría, bibliometria, bibliometric analysis, bibliometric methods, bibliometics, bibliometircs, bibliometric analysis in mining sciences, bibliometric mapping, bibliometric studies, bibliometric visualization, bibliometric., bibliometrics methodology, bibliometrics of social sciences and…, bibliometrics., bibliometrics..., bibliométrie, bibliometry.
  2. Institutional affiliation: the profiles associated with research centers working on Bibliometrics were downloaded. As an example, the profiles with these verified e-mail domains were selected: cwts.leidenuniv.nl, cwts.nl
  3. Since there may be some authors working in this discipline who have created a public GSC profile, but who haven’t added significant keywords or filled in the institution field in their profile, we also conducted a topic search on Google Scholar (using the same keywords as before), and a journal search (all the documents indexed in Google Scholar published in the journals Scientometrics, Journal of Informetrics, Research Evaluation, Cybermetrics, as well as the ISSI proceedings), with the aim of finding authors we might have missed with the previous two strategies. These searches returned roughly 15,000 documents. Additionally, these searches allowed us to find documents written by authors that don’t have a public GSC profile, but which are nonetheless extremely relevant to the discipline.

These searches were conducted on the 24th of July, 2015.
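
The keyword-variant matching described in the first strategy can be sketched in a few lines. This is only an illustration, not the procedure the authors actually used: the `VARIANTS` table, the function names, and the first-word rule for multi-word labels are all assumptions, and the sample variants are taken from the list given above for “bibliometrics”.

```python
import unicodedata

# Hypothetical lookup table: canonical keyword -> accepted one-word variants.
# The variants are a subset of those listed in the text for "bibliometrics".
VARIANTS = {
    "bibliometrics": {
        "bibliometric", "bibliometria", "bibliometics",
        "bibliometircs", "bibliometrie", "bibliometry",
    },
}


def normalize(label):
    """Lowercase a label, strip accents, and drop trailing dots and spaces."""
    decomposed = unicodedata.normalize("NFKD", label)
    no_accents = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return no_accents.lower().strip().rstrip(".… ").strip()


def canonical_keyword(label):
    """Return the canonical keyword a profile label maps to, or None."""
    cleaned = normalize(label)
    # Assumption: multi-word labels such as "bibliometric analysis"
    # are matched on their first word only.
    first_word = cleaned.split(" ")[0] if cleaned else ""
    for canonical, variants in VARIANTS.items():
        for candidate in (cleaned, first_word):
            if candidate == canonical or candidate in variants:
                return canonical
    return None
```

A label that matches no canonical keyword returns `None`, which would leave the profile out of the candidate set until the manual revision described below.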

Since Google Scholar Citations gives the author complete control over how to set their profile (personal information, institutional affiliation, research interests, as well as their scientific production), a systematic manual revision was carried out in order to:

  • Detect false positives: authors whose scientific production doesn’t have anything to do with this discipline, even though they labeled themselves with one or more of the keywords associated with it.
  • Classify authors in two categories:
    • Core: those authors whose scientific production substantially falls within the field of Bibliometrics.
    • Related: those authors who have sporadically published bibliometric studies, or whose field of expertise is closely related to Scientometrics (social, political, and economic studies about science), and therefore they can’t be strictly considered bibliometricians.
    In order to set a limit between the two categories, we have decided to consider as core authors those who meet a certain criterion: at least half of the documents which contribute to their h index should be about Bibliometrics. We considered the titles of the documents, as well as the publishing channel where they appeared, focusing our attention on the journals. Our Bradford-like core of journals about Bibliometrics consisted of six journals (Scientometrics, Journal of Informetrics, JASIST, Research Evaluation, Research Policy, Cybermetrics), followed by other LIS journals which also publish numerous bibliometric studies (Journal of Information Science, Information Processing & Management, Journal of Documentation, College & Research Libraries, Library Trends, Online Information Review, Revista Española de Documentación Científica, Aslib Proceedings, El Profesional de la Información) and lastly, journals devoted to social and political studies about science (Social Studies of Science, Science and Public Policy, Minerva, Journal of Health Services Research & Policy, Technological Forecasting and Social Change, Science, Technology, & Human Values, Environmental Science & Policy, Current Science).
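
The core/related criterion can be expressed as a short sketch. It assumes that each document has already been judged manually as bibliometric or not (the `is_bibliometric` flag is hypothetical; in the study this judgement came from titles and publishing channels), and the function names are illustrative only.

```python
def h_core(docs):
    """Return the documents that contribute to an author's h index.

    `docs` is a list of (citations, is_bibliometric) tuples. A document
    belongs to the h-core when its rank, counted in descending order of
    citations, does not exceed its citation count.
    """
    ranked = sorted(docs, key=lambda d: d[0], reverse=True)
    core = []
    for rank, doc in enumerate(ranked, start=1):
        if doc[0] >= rank:
            core.append(doc)
        else:
            break
    return core


def classify_author(docs):
    """Label an author 'core' if at least half of the h-core is on topic."""
    core = h_core(docs)
    if not core:
        # Assumption: an author with an empty h-core is treated as related.
        return "related"
    on_topic = sum(1 for _, is_bibliometric in core if is_bibliometric)
    return "core" if 2 * on_topic >= len(core) else "related"
```

For instance, an author whose h-core holds three documents, two of them bibliometric, would be labeled core under this rule.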

In the end, we selected a total of 813 GSC profiles. 397 of them have been considered core authors, and 416 related authors.

These 813 authors were searched by name in ResearcherID, ResearchGate, Mendeley, and Twitter, and in the cases where a profile was found, the indicators provided by these platforms were downloaded. The data collection for these platforms was carried out between the 4th and 10th of September, 2015.

Search and identification of the most cited documents in the discipline: Once we defined the set of authors, we extracted the top 100 most cited documents for each author from their GSC profile. To this set of documents, we added the documents we found in our previous topic and journal searches (the third strategy we used to find authors who work on bibliometrics). After deleting duplicates, a set of roughly 41,000 documents remained. In the cases where various versions of the same document were found with different numbers of citations, the one with the highest citation count was selected. This list was sorted by number of citations, and the top 1000 most cited documents that were found to be related to Bibliometrics were selected to be shown in the product. This list of 1000 documents was also used to build the journal and book publisher rankings available in the product, which measure the presence and impact (in terms of percentage of documents and citations) each journal and book publisher has in this sample of 1000 highly cited documents.
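
The deduplication and ranking steps might look roughly like the sketch below. The text does not say how duplicate versions were detected, so matching on a whitespace- and case-normalized title is purely an assumption, as are the record fields `title`, `citations`, and `on_topic`.

```python
def top_cited(records, n=1000):
    """Keep the most cited version of each document, then rank on-topic ones.

    `records` is an iterable of dicts with the hypothetical keys
    'title', 'citations', and 'on_topic' (a manual relevance judgement).
    """
    best = {}
    for rec in records:
        # Assumption: two records are versions of the same document
        # when their titles match after lowercasing and collapsing spaces.
        key = " ".join(rec["title"].lower().split())
        if key not in best or rec["citations"] > best[key]["citations"]:
            best[key] = rec
    relevant = [rec for rec in best.values() if rec["on_topic"]]
    relevant.sort(key=lambda rec: rec["citations"], reverse=True)
    return relevant[:n]
```

Title matching alone would miss versions whose titles differ, so a real pipeline would likely need additional checks.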

Data visualization and indicators

This product offers indicators at four levels: author, document, journal, and book publisher.

Authors

General Overview Page: the General Overview author page displays the main indicators found for each author in each of the platforms analyzed in this study. The first column contains the name of the author, and the second one contains the links to the online profiles in different platforms that were found for that particular author. The third and subsequent columns contain some of the indicators available in each platform:

  • Google Scholar Citations
    • Citations since 2010
    • H Index since 2010
  • ResearcherID
    • Citations
    • H Index
  • ResearchGate
    • RG Score
    • Downloads
  • Mendeley
    • Readers
    • Followers
  • Twitter
    • Tweets
    • Followers

Platform-specific author tables: for each of the platforms (Google Scholar Citations, ResearcherID, ResearchGate, Mendeley, and Twitter) there is a separate table that displays all the author indicators offered by the selected platform. These tables can be accessed from the General Overview table by clicking on the appropriate header for each platform. A detailed explanation of the meaning of each indicator can be found below.

Sorting tables: It is possible to sort the author tables by any of the indicators, just by clicking on the name of the indicator. It is also possible to sort the table by names (alphabetically). By default, indicators will be sorted in descending order. When the table is sorted by a given indicator, clicking again on the name of the same indicator will sort the table in ascending order. Text fields will be sorted in ascending order by default.

Core/Related authors: By default, only core authors are displayed. In order to display all authors, check the box with the label “Check to display related authors as well”. When that box is checked, it’ll be possible to tell core and related authors apart because the rows for core authors will be displayed with a grey background. Uncheck the box to display only core authors again.

Search Box: a search box is available to facilitate the task of finding a specific author. If a name or surname is entered in the box, a list of up to five names will appear just below it. Selecting one of them will automatically take the user to the page where that author is found (taking into account the current sorting criteria), and it’ll be easily distinguishable from the rest because the background will be highlighted in yellow.

Special symbols used in the tables:

- --> The author hasn’t created a profile in the platform where the indicator is found.

NA --> The author has created a profile in the platform where the indicator is found, but for some reason the indicator is not available.

List and explanation of the author-level indicators:

Google Scholar Citations

  • Citations: Number of citations to all publications. Computed for citations from all years, and citations since 2010.
  • h-index: The largest number h such that h publications have at least h citations. Computed for citations from all years, and citations since 2010.
  • i10 index: Number of publications with at least 10 citations. Computed for citations from all years, and citations since 2010.
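
As an illustration of how the three Google Scholar Citations indicators relate to each other, the sketch below computes them for all years and for a recent window. The input format (one list of citing years per publication) and the function names are assumptions made for the example, and “since 2010” is read as citations dated 2010 or later; GSC itself only exposes the resulting figures.

```python
def h_index(citation_counts):
    """Largest h such that h publications have at least h citations."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank
        else:
            break
    return h


def i10_index(citation_counts):
    """Number of publications with at least 10 citations."""
    return sum(1 for count in citation_counts if count >= 10)


def gsc_indicators(citing_years_per_pub, since=2010):
    """Return each indicator as an (all years, since `since`) pair.

    `citing_years_per_pub` holds, for every publication, the list of
    years in which it was cited (hypothetical input format).
    """
    all_counts = [len(years) for years in citing_years_per_pub]
    recent_counts = [
        sum(1 for year in years if year >= since)
        for years in citing_years_per_pub
    ]
    return {
        "citations": (sum(all_counts), sum(recent_counts)),
        "h_index": (h_index(all_counts), h_index(recent_counts)),
        "i10_index": (i10_index(all_counts), i10_index(recent_counts)),
    }
```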

ResearcherID

  • Total Articles in Publication List: The number of items in the publication list.
  • Articles with Citation Data: Only articles added from Web of Science Core Collection can be used to generate citation metrics. The publication list may contain articles from other sources. This value indicates how many articles from the publication list were used to generate the metrics.
  • Sum of the Times Cited: The total number of citations to any of the items in the publication list from Web of Science Core Collection. The number of citing articles may be smaller than the sum of the times cited because an article may cite more than one item in the set of search results.
  • Average Citations per Item: The average number of citing articles for all items in the publication list from Web of Science Core Collection. It is the sum of the times cited divided by the number of articles used to generate the metrics.
  • h-index: The largest number h such that h articles have at least h citations each. For example, an h-index of 20 means that there are 20 items that have 20 citations or more.

ResearchGate

  • RG Score: A metric that measures scientific reputation based on how an author’s research is received by his/her peers. The exact method to calculate this metric has not been made public, but it takes into account how many times the contributions (papers, data, etc.) an author uploads to ResearchGate are visited and downloaded, and also by whom (reputation).
  • Publications: Total number of publications an author has added to his/her profile in ResearchGate (full-text or not).
  • Views: Total number of times an author’s contributions to ResearchGate have been viewed. This indicator has recently been combined with the “Downloads” indicator to form the new “Reads” indicator, but the data collection for this product was made before this change came into effect.
  • Downloads: Total number of times an author’s contributions to ResearchGate have been downloaded. This indicator has recently been combined with the “Views” indicator to form the new “Reads” indicator, but the data collection for this product was made before this change came into effect.
  • Citations: Total number of citations to the documents uploaded to the profile. ResearchGate generates its own citation database, and they warn this number might not be exhaustive.
  • Impact Points: Sum of the JCR impact factors of the journals where the author has published articles.
  • Profile views: Number of times the author’s profile has been visited.
  • Following: Number of ResearchGate users the author follows (the author will receive notifications when those users upload new material to ResearchGate).
  • Followers: Number of ResearchGate users who follow the author (those ResearchGate users will receive notifications when the author uploads new material to ResearchGate).

Mendeley

  • Readers: This number represents the total number of times a Mendeley user has added a document by this author to his/her personal library.
  • Publications: Number of publications the author has uploaded to Mendeley and classified as “My Publications”.
  • Followers: Number of Mendeley users who follow the author.
  • Following: Number of Mendeley users the author follows.

Twitter

  • Tweets: Total number of tweets an author has published according to their profile.
  • Followers: Number of Twitter users who follow the tweets published by the author.
  • Following: Number of Twitter users the author follows.
  • Days registered: Number of days since the author created an account on Twitter.

Documents:

For each of the top 1000 most cited documents shown in this list, the basic bibliographic information, as well as the number of citations according to GS (Google Scholar) and WoS (Web of Science), are displayed. For those documents that are not indexed in WoS Core Collection, the number of citations in WoS was calculated by searching for the document in WoS’s Cited Reference Search. By doing this we’re trying to highlight the (until now mostly neglected) potential of this tool, which truly offers a wealth of citation data that could be used for the evaluation of non-WoS documents. We also wanted to bring attention to the fact that the percentage of documents with citations in GS and WoS is very similar. In the cases when a book is a collective work, the number of citations is the sum of the citations to all the chapters, as well as the citations to the book in general.

Journals:

The two indicators displayed in the journal table are:

  • Percentage of highly cited articles (included in the list of top 1000 most cited documents) published in a given journal.
  • The percentage the citations to these articles represent, relative to the total number of citations received by the top 1000 most cited documents.
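
Both indicators are simple shares over the list of highly cited documents, and the same calculation would apply to book publishers. A minimal sketch follows; the `(venue, citations)` input format and the function name are assumptions, and both denominators are taken over the whole list passed in.

```python
from collections import defaultdict


def venue_shares(documents):
    """Return {venue: (% of documents, % of citations)}.

    `documents` is a list of (venue, citations) pairs covering the whole
    set of highly cited documents (hypothetical input format).
    """
    total_docs = len(documents)
    total_cites = sum(cites for _, cites in documents)
    doc_counts = defaultdict(int)
    cite_counts = defaultdict(int)
    for venue, cites in documents:
        doc_counts[venue] += 1
        cite_counts[venue] += cites
    return {
        venue: (
            100.0 * doc_counts[venue] / total_docs,
            100.0 * cite_counts[venue] / total_cites if total_cites else 0.0,
        )
        for venue in doc_counts
    }
```

A journal can therefore hold a small share of the documents and still account for a large share of the citations when its articles sit near the top of the list.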

Book publishers:

The two indicators displayed in the publishers table are:

  • Percentage of highly cited books (included in the list of top 1000 most cited documents) published by a given publisher.
  • The percentage the citations to these books represent, relative to the total number of citations received by the top 1000 most cited documents.

Limitations:

Projects of a bibliographic nature like this one can’t ever reach perfection, and it is entirely possible that we may have missed relevant authors. The author selection criteria required an author to have created a public GSC profile by 24/07/2015 (when the data collection was made), and to have published works in the fields of Bibliometrics, Scientometrics, Informetrics, Webometrics, or Altmetrics.

We’re completely aware that these lists don’t include all the researchers in the area, since some haven’t created a profile, or they haven’t made it public. We should note that we made an exception with Eugene Garfield, one of the fathers of bibliometrics: he doesn’t have a public GSC profile, but we manually searched his production on Google Scholar and computed the same indicators GSC displays. We believe this product would be incomplete without him.

We strongly encourage researchers without a GSC profile, and especially those who have made important contributions to the development of this field, to bring together the scattered bibliographic information Google Scholar has already compiled about their works. Sharing this information would not only greatly benefit them, but it would also be very useful to the rest of the scientific community.

Lastly, we would like to remind users that the data shown in this product reflects the situation at the time the data was collected: 24/07/2015 (GSC profiles), and 10/09/2015 (data from ResearcherID, ResearchGate, Mendeley, and Twitter).