Sunday 17 June 2012

Addendum: the CRIS2012 conference and CRIS systems in Spain


  From a national point of view, the CRIS2012 conference delivered some positive results: the number of representatives from Spanish institutions has increased fivefold since the previous CRIS2010 conference in Aalborg, this time including FECYT as well as research centres and universities in the process of CRIS/institutional repository integration. Among the steps still pending, no presentation from a Spanish institution has yet been delivered at a CRIS conference. The national contribution is so far restricted to the euroCRIS membership meetings, where last year both OCU and Sigma gave talks on their development projects. Although both the duration and the cost of attending these CRIS conferences explain why the presence was not larger this year, some of the projects presented at the GrandIR workshop on CRIS and repositories held last November in Barcelona could perfectly well have been part of the CRIS2012 programme. This under-representation will in any case be redressed at the forthcoming autumn euroCRIS membership meeting to be held in Madrid next November, where a whole series of CRIS projects from institutions and organisations in Spain will be showcased.

Regarding the technical issues discussed in the event programme, it is worth mentioning that Spain as a country follows a somewhat different path from the rest: a leading project that arouses much curiosity in other countries, the CV Normalizado (CVN), which is based on data transfer from institutional CRISs, coexists with a still relatively limited level of CRIS implementation. On the one hand, CRIS/repository integration projects are becoming frequent for the most established CRIS models (GREC, DRAC, Universitas XXI). On the other, implementation of the CERIF standard in these systems has not yet begun in the country.

It is not that CERIF is an indispensable standard for developing efficient CRISs: Italy, for example, has a very high rate of institutional CRIS implementation, and these systems are not as a rule based on CERIF. However, in order to guarantee international interoperability and to incorporate the advances taking place in the design of the data model of this Common European Research Information Format (including developments for modelling the social impact of research or for encoding additional information elements such as research data or available research facilities), it would be highly advisable to have at least one national-level project devoted to analysing the requirements for migrating current data models to CERIF.

The forthcoming autumn euroCRIS membership meeting may be a good opportunity to discuss whether a national strategy should be considered for implementing institutional CRIS systems as standardised providers of information on research output for its assessment by the Ministry, in a similar way to how this process is being planned in other European countries. Added features of general interest, such as the ORCID standard for persistent author identification, could in fact be integrated quite naturally into a national-level strategy for research information transfer. It would in any case be interesting for some university or research centre in Spain to respond to the ORCID consortium's call for institutions interested in early implementation of the standard as part of its consolidation process. The euroCRIS membership meeting in Madrid will also include a session devoted to identifiers, which may be the ideal forum for discussing this matter, given that ORCID is expected to be in service by November.

Friday 15 June 2012

CRIS2012 Conference in Prague: Consolidating CRIS Infrastructure in Europe and the way beyond


  A very successful 11th International Conference on Current Research Information Systems (CRIS2012) was recently held in Prague (June 6-9th, 2012) under the motto “e-Infrastructures for Research and Innovation: Linking Information Systems to Improve Scientific Knowledge Production”. A record 154 representatives from 26 countries attended the most crowded euroCRIS biennial conference ever, and the number of submissions for the conference was also the highest so far (with the UK having the largest number of representatives at CRIS2012 and Norway the best rate of submission acceptance). The usual mix of very different professional profiles (researchers, funders, research managers, research office representatives, institutional repository managers, IT managers, developers...) that makes CRIS conferences so special was even further enriched at CRIS2012 by the large number of colleagues who were attending a CRIS conference for the first time.

This event has indeed marked a maturity milestone for euroCRIS, the European Organisation for International Research Information, which holds the CRIS conferences every two years (see report for CRIS2010 conference in Aalborg, Denmark at the SONEX blog). euroCRIS has just turned 10 years old as custodian of the Common European Research Information Format (CERIF) standard and as a key stakeholder in the promotion of CRIS Systems for efficient Research Information Management in Europe and beyond.

While the UK is known to be the most advanced European country in terms of CERIF-based CRIS implementation in HEIs (see recent report from Rosemary Russell, UKOLN), holding the CRIS2012 conference in an Eastern European country offered the opportunity to realize how the highest momentum in National CRIS System development in Europe is shifting eastwards, with running or completed projects in Slovakia, Slovenia and the Czech Republic itself *.


Official support for the CRIS2012 conference from the Research, Development and Innovation Council of the Czech Republic was in fact one of the key factors in the event being so successful, including a welcome address by the Czech Prime Minister (and President of the Research Council) at the euroCRIS membership meeting reception at Liechtenstein Palace in Prague. Last but not least, a real key contribution to a successful CRIS2012 was the brilliant event organisation provided by Jan Dvořák (InfoScience Praha s.r.o.) and his team.


One of the main outcomes of the 4-day conference was in fact the announcement of the ongoing development of a DRIS or Directory of Research Information Systems, which will collect information on running or in-progress CERIF-based CRIS Systems all around the world along with their features and best practices in their implementation and management. This DRIS should help projects that are just starting to look for the best solutions and to find institutions they may be interested in contacting for the purpose of developing and implementing their own CRIS. CRIS implementation time is a particularly interesting area, since there are large differences among institutions where CRISs are set up, and best practices and guidelines on institutional data collection could be very useful for those universities starting the process.

A brief summary of the talks held during the week-long event should include (at least) three main strands as well as a reference to the evolving CERIF data model, currently at version 1.3 with in-progress work at 1.4 as presented by euroCRIS CERIF Task Group leader Brigitte Jörg. These three main strands are (i) added-value services on CRIS Systems, (ii) CRIS functionality extension to research data management and Linked Open Data and (iii) persistent identifier definition and implementation into the CERIF data model.

1. Added-value services on CRISs
As the number of both national and institutional CERIF-based CRISs steadily grows across Europe, vendors and institutional IT services are teaming up to identify new services the system could provide to researchers and institutions. Some of the proposals for enhanced CRIS interoperability and coverage were presented at CRIS2012, such as the JISC-funded CERIFy project for enabling a two-way CERIF-based data exchange between CRISs and the Thomson Reuters InCites service, or the 'Next-Generation CRIS' currently being developed at the Karlsruhe Institute of Technology (KIT). The latter project aims to extend CRIS functionality by providing Social Media features, access via mobile devices and advanced Business Intelligence tools for "making numbers talk" to research managers. Finally, semantics was also often mentioned in CRIS2012 presentations as a relevant enhancement to CRIS systems.

2. CRIS coverage extension to research data and LOD
Research Data Management (RDM) is one of the areas where CRISs could be most useful to the international research community by enabling a systematic management of institutional research data outputs. In order to do so, however, the CERIF data model must first be extended so that it is able to cover research data description and management. The CERIF for Datasets (C4D) Project funded by the JISC MRD Programme and led by the University of Sunderland in the UK is working to CERIFy research data and to enable its subsequent codification into CRISs, using marine sciences datasets and an enhanced version of the MEDIN metadata standard as a basis. Required metadata for data description were also analysed in the "Towards the integration of datasets in the CRIS environment" presentation by Italian IRPPS-CNR, which provided an overview of data archives featured in OpenDOAR that offer information about projects. The ENGAGE Project and its CERIF-based metadata approach for a Public Sector Information data infrastructure were introduced by Nikos Houssos, while CRIS enhancement through Linked Open Data features has recently been acknowledged as a relevant line of work by euroCRIS through the creation of a specific LOD Task Group.

3. Persistent identifier implementation into CERIF data model
"The need for identifiers beyond systems is a global requirement but also relevant within organization boundaries spanning multiple systems. Various identifier initiatives and systems have started in the scientific domain and beyond. However, they have not yet achieved the required interoperability".
This quotation from the presentation "Entities and Identities in Research Information Systems" delivered by Brigitte Jörg summarizes the much discussed need to integrate author, organisation and project persistent identifiers into CERIF in order to enable LOD-based approaches to succeed. After a first attempt at UUID-based persistent identifier codification was performed last February at the euroCRIS Task Group meeting in Bath, CRIS2012 featured a specific 'IDs, Disambiguation, Interoperation' session where ID implementation requirements were further discussed.

Recently appointed ORCID Executive Director Laurel L. Haak attended CRIS2012 and had the chance to describe the road ahead for ORCID implementation during the event. With its API released earlier this year, the ORCID service will be launched during the fourth quarter of 2012. Researchers will be able to create, manage and share their ORCID record for free at launch time, and ORCID is currently working with interested universities and research centres to sign agreements for early implementation at institutional level.

An updated presentation of the ORCID initiative -as well as an insight into CERIF enhancement for integrating persistent identifiers- will be featured at the 'Topic session' devoted to Identifiers at the forthcoming Autumn 2012 euroCRIS membership meeting to be held in Madrid next November. A preliminary programme for the event is already available and free registration will open soon, now that CRIS2012 is over.


CRIS2012 Prague closed with an outstanding social programme - including a boat cruise along the Vltava and the chance to attend various events at Museum Night Prague. Before that, IRPPS-CNR in Rome had been announced as the host of the next CRIS conference in 2014 and euroCRIS President Keith Jeffery delivered an inspired closing speech on the future of CRISs, CERIF and Research Information Management.

Links to the slides of all presentations mentioned in this post will be offered as soon as they are made available online.



* as well as northwards, with Sweden also implementing a National CRIS and Norway already operating CRIStin, while in Southern Europe, Italy has already used widely implemented institutional CRISs to collect the national research output earlier this year.

Friday 1 June 2012

PEER End of Project Conference: a few reflections



  The fact that the PEER European Project (Publishing and the Ecology of European Research) has managed to establish a fruitful communication channel between publishers and repositories was repeatedly highlighted during the PEER End of Project Conference held last Tue May 29th in Brussels. This ability to foster successful collaboration between stakeholders who initially held conflicting positions is undoubtedly one of the main PEER outcomes, and it would be good news for the Open Access movement as a whole if these communication channels could remain open in the future. As Norbert Lossau put it, favouring pragmatism over ideology could be very useful for jointly outlining evolving business models.

The second most important achievement of the PEER project was establishing a tested publisher-repository transfer infrastructure which can be deployed beyond the project. A good number of PEER components and technical findings -such as the PEER Depot dark archive, adoption of the TEI format as a unique metadata interchange standard or SWORD as standard transfer protocol, the way usage is dealt with or the use of the GROBID component for automatic metadata extraction- are potentially re-usable for other ongoing or future publisher-driven transfer initiatives. They are especially valuable for automatic item transfer into repositories within the hegemonic Gold Open Access scenario that was also frequently predicted during the meeting.

Additional publisher-driven deposit initiatives such as the Japanese 'Zoological Science meets Institutional Repositories' were mentioned during the conference, as was COAR involvement in the interoperability strand, which potentially offers opportunities for follow-up work. Besides that, the JISC-funded SONEX Group has repeatedly underlined in its analysis of deposit use-case scenarios the strong workflow similarities between PEER and the JISC Open Access Repository Junction (OA-RJ) Project carried out at EDINA in Edinburgh. The RJ Broker feature -which performs a very similar role to the PEER Depot 'moulinette'- is currently being enhanced and will shortly be offered as a service through the UK RepositoryNet+ Project.


The figures associated with the PEER project are certainly impressive: 53,000 stage-two manuscripts (aka post-prints in SHERPA RoMEO terminology) from 241 journals published by 12 mainstream publishers were processed by the PEER Depot, resulting in 22,500 EU manuscript deposits (including embargoed papers) released into six different IRs plus a long-term preservation archive at the KB in The Hague. Two submission routes were designed: automatic publisher-driven deposit and 11,800 invitations to authors to self-archive their papers, the latter resulting in just 170 author deposits (or 0.2% of total PEER deposits).


The large difference between the deposit figures for the two routes led PEER researchers to conclude that authors sympathise with OA but don't see self-archiving as their task, therefore "Green OA not being the key road to optimal scholar information systems". The PEER Usage research -one of the three research team projects within the PEER Research strand along with Behavioural and Economics research- also showed that, although current findings reflect a relatively early stage in PEER development, Open Access repositories are not really a threat to publishers (thus confirming the so-called "no effect" publisher hypothesis). In fact, making pre-prints visible in PEER repositories actually generates more traffic to publisher sites, although the ever-growing rates of publisher downloads make it hard to measure accurately the impact on publishers of post-print availability in repositories. Ian Rowlands from CIBER Research Ltd estimated that publisher full-text downloads increased by 11.4% as a result of earlier versions of papers being available at the IR.

Gold vs Green OA

While testing Green Open Access and its economic consequences for the publishing ecosystem in Europe was the main PEER goal, and Green OA was the preferred line of work when the Project started back in Sep 2008, the Gold Open Access route nowadays seems to be winning the hearts and minds of those trying to promote access to research output on a wide basis. PEER has produced a good deal of evidence that Green OA harms neither journals nor publishers, but in the meantime attention has shifted to Gold Open Access and hybrid journals as a way to ensure that final publisher/PDF versions of the papers are made available.

This is probably the strongest argument in favour of Gold OA, but there are also very good ones that support Green OA. As a result, a lively debate is taking place these days inside the Open Access community on which OA model should receive the main support from government bodies. Many voices also argue that both models should co-exist, as research output coverage will be wider as a consequence. Finally, there is an important fact to take into account in view of the PEER result of 99.8 vs 0.2% automatic vs author-driven deposit: author self-archiving rates should not be systematically used as reliable indicators of the strength of Green OA, since there is nowadays a wealth of alternative ways to populate repositories that do not imply self-archiving obligations for authors. In fact CRIS systems, their integration with IRs and the resulting alternative workflows for content ingest into repositories were not mentioned at all last Tuesday, despite having already been proved effective by a recently released UKOLN report. When trying to offer a fair estimate of Green OA relevance based on the wider deposit picture, the contribution to repository population from these alternative workflows should also be considered.