iridium – postgraduate evaluation of MANTRA RDM training (2) – Sharing, preservation and licensing unit

From Jack:

The final module of the MANTRA online research data management training is entitled Sharing, Preservation and Licensing. It is the second of two new modules (the other being Data Protection, Rights and Access) and focuses on the back end of the research lifecycle. When working on a project, the researcher’s main focus will be gathering the data and achieving outputs, and there may initially be little focus beyond this. Once work has been completed, preservation and sharing may be of great importance in ensuring the greatest possible impact; if research is intended to be cumulative and part of a community, then making research data available should be a priority. However, for some there may be restrictions on the extent to which they can make data available and limits on how others are able to use it. These are also covered in this module.

The module outlines the benefits of sharing research data. There are benefits for researchers themselves (scientific integrity, funder requirements and preservation for one’s own future use) and for the research community more widely (teaching, impact, collaboration and public record). Whilst for the most part we may take on good faith the validity of outputs published in journals and other academic papers, the module outlines some high-profile instances in which results were fabricated by researchers. It argues that making available the data upon which outputs are based ensures the legitimacy of research and promotes openness in its conduct.

Whilst outlining the importance of preserving data for future reuse, the module highlights the difficulties and potential problems of maintaining it over time. Rapid changes in file formats and obsolete storage methods are cited as potential future issues for access. Though this may seem an undue hindrance to one’s research activities, I see it as emphasising the importance of proper and correctly managed data preservation. Reasons are given, with emphasis, for placing data into repositories. A further strength of the module is that, whilst for the most part it focuses on the creator of data, it also recognises the position of the secondary data user and provides help for them.

For further guidance on data preservation and best practice, the recommended reading, the DCC Curation Reference Manual (http://www.dcc.ac.uk/resources/curation-reference-manual), provides in-depth curation techniques split into several chapters (some still in development).

This final module of the MANTRA training completes a comprehensive yet straightforward beginner’s guide to research data management. Having reviewed the content of several online data management guides recently, I would recommend the University of Edinburgh learning units as an introduction for fellow postgraduate researchers and equally for anybody with a related interest in research data management.

MANTRA available from: http://datalib.edina.ac.uk/mantra/preservation.html (CC-by licensed)

iridium – postgrad evaluation of MANTRA RDM training – Sharing, Preservation and Licensing unit

From Amy.

The new unit from the MANTRA Data Management Training programme focuses on Sharing, Preservation and Licensing, which follows on well from the previous unit on Data Protection, Rights and Access. The module took about an hour to get through, making notes as I went, and I found it a useful introduction to a topic that I know fairly little about.

The unit discusses the reasons for and against sharing research data and the benefits that can be enjoyed by researchers who do decide to share data. Other guides that I have read on this topic seem to offer a more one-sided view of the debate as they are trying to encourage researchers to share data. While this is understandable, and ultimately the aim of increasing awareness will be that more researchers share more data, it can sometimes make the source appear slightly less credible. For this reason, I was really pleased that this unit included a section on the barriers to sharing research data. For the issue of confidentiality it offered the solution of anonymisation, but it also recognised that financial and ownership issues are sometimes capable of preventing sharing altogether. By recognising that not all research data can be shared, its advice on data that can be shared became more realistic.

The unit lists extensive benefits of sharing research data, including scientific integrity, meeting funder requirements, increasing research impact and preserving data for personal future use. This is all underlined by real-life cases where not properly preserving or sharing data has caused problems. The unit gives an example of a postgraduate research student whose project was spoiled because they could not access the relevant data. While this is useful, the point is underlined far more seriously by the examples of researchers who were accused of falsifying data and did not have the records to back up their research. One of the benefits that I could identify with the most was the impact that sharing data can have on teaching. The unit suggests that using research data in teaching is a good way to teach students how to collect and analyse data. Also, in my experience as a student, some of the most interesting teaching sessions I have had were those when lecturers talked about their current or recent projects and showed us data that they had collected for these. It made teaching much more closely related to research and made us, as students, feel more involved with what was going on in the University than when you feel like you’re just being taught from a set syllabus.

The unit also covers licensing issues and introduces Open Data Commons as a source of guidance and of licences that conform to the principles set out in the Open Knowledge Foundation’s definition of open knowledge. The unit definitely succeeded in its aims: the information provided, combined with the activities outlining key terms and definitions, was useful to me as a postgraduate student in considering my own research, and also in considering data that I am using that belongs to someone else.

See MANTRA http://datalib.edina.ac.uk/mantra/preservation.html

iridium – postgrad evaluation of MANTRA RDM training – Data protection, rights and access unit

From Blanca.

Today I had the opportunity to explore the “Data protection, rights and access” unit of MANTRA. This is quite a new unit which offers plenty of relevant and essential concepts.

Firstly, it discusses the concept of ethics and how ethical requirements need to be taken into consideration when planning RDM. Ethics is a serious issue, especially when it involves people. Most of the examples and RDM strategies discussed in the unit concern data about people.

Essential concepts this unit focusses on are privacy, consent and confidentiality. The first step towards ethical research is to obtain consent from your research subjects (this way people are given the right to take decisions on the use of their personal data). Next, the researcher needs to guarantee the protection of the subjects’ privacy; to do so, the researcher will need to outline confidentiality strategies (an agreement between the researcher and the research subjects on how their identifiable private information will be handled, managed and disseminated).

Besides ethics, the unit stresses how important legal considerations are for RDM. The Data Protection Act 1998 regulates the handling of personal data. Failure to comply with these regulations can incur extremely severe consequences for organisations and individuals; the unit provides a series of stark examples. Even huge institutions such as the NHS are not exempt!

Next, the unit provides some very useful anonymisation techniques (masking data so that no personal identifiers are present); a document with some examples is provided.
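
As a rough illustration of the masking idea described above (not taken from the MANTRA unit itself), the sketch below drops direct identifiers from a record and substitutes a salted-hash participant code. The field names, the `P-` prefix and the way the salt is handled are all assumptions made for the example. Strictly speaking this is pseudonymisation, and indirect identifiers such as age or location may still need generalising before data can be treated as anonymous.

```python
import hashlib

# Hypothetical column names; adjust to the dataset in question.
DIRECT_IDENTIFIERS = {"name", "email", "postcode"}


def pseudonym(value: str, salt: str) -> str:
    """Return a short, stable code derived from a salted SHA-256 hash."""
    digest = hashlib.sha256((salt + value).encode("utf-8")).hexdigest()
    return "P-" + digest[:8]


def anonymise(record: dict, salt: str, key_field: str = "name") -> dict:
    """Drop direct identifiers and add a pseudonymous participant code."""
    cleaned = {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}
    cleaned["participant"] = pseudonym(str(record[key_field]), salt)
    return cleaned


if __name__ == "__main__":
    row = {
        "name": "Jane Doe",
        "email": "jane@example.org",
        "postcode": "NE1 7RU",
        "age": 34,
        "response": "agree",
    }
    # The salt must be kept secret and stored separately from the data.
    print(anonymise(row, salt="keep-this-secret"))
```

The same person always maps to the same code for a given salt, so responses can still be linked across files without the name being stored.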

Finally, the unit discusses what “Intellectual Property Rights” and “Freedom of Information” are.

Intellectual property (IP) is all about creations of the mind. Laws try to make sure owners of these creations are granted certain exclusive rights when it comes to the commercialisation of their creations. There are two categories: industrial property (which includes patents, trademarks…) and copyright (for literary and artistic works). Freedom of Information (FoI), on the other hand, is about giving the public the right to access information held by public bodies.

In general, I found this unit to be quite vast in content. The approach it takes to explaining the concepts is really good and concise. However, it didn’t have as many interactive parts as previous units. The unit also provides some other recommended resources.

MANTRA Data protection, rights and access unit: http://datalib.edina.ac.uk/mantra/dataprotection.html

iridium – early findings on research data management planning (approaches, tools and writing plans)

Below is a brief summary of some resources, findings and discussions on research data management plans (DMP) that have been noted along the way since the project start-up. This has been collected from several activities and events, such as the iridium support team’s use of the MANTRA RDM online training package and the project’s RDM tools assessment, together with attendance at the JISC Meeting (Disciplinary) Challenges in Research Data Management Planning Workshop and the DCC Roadshow North East.

Definitions

“Research data management refers to all aspects of creating, housing, delivering, maintaining, and archiving and preserving data. It is one of the essential areas of responsible conduct of research.” – MANTRA

“Plans typically state what data will be created and how, and outline the plans for sharing and preservation, noting what is appropriate given the nature of the data and any restrictions that may need to be applied.” – DCC

Purpose:

  • to assist in planning the research data management (RDM) aspects of your research
  • to assist you in making RDM decisions
  • to identify the RDM actions required
  • to highlight areas that need further thought
  • to provide a record of decisions made and actions taken

http://www.northumbria.ac.uk/static/5007/ceispdf/dmpguide.pdf

Attribution: Northumbria University School of Computing, Engineering & Information Sciences, 2012. CC-BY-SA

Benefits:

The benefits of managing your data include:

  • Meeting funding body grant requirements.
  • Ensuring research integrity and reproducibility.
  • Increasing your research efficiency.
  • Ensuring research data and records are accurate, complete, authentic and reliable.
  • Saving time and resources in the long run.
  • Enhancing data security and minimising the risk of data loss.
  • Preventing duplication of effort by enabling others to use your data.
  • Complying with practices conducted in industry and commerce.

– MANTRA

Local DMP practice/DAF survey results

From our survey (128 projects), we found that 23% of projects across the institution as a whole have a formal research data management plan, with a further 33% having a partial RDM plan (the split by Faculty suggested a slightly higher proportion where there is likely to be a higher proportion of Research Council awards). I expect this is similar across the sector? The Open Exeter project reported ‘few researchers have experience of completing a data management plan’ from their DAF survey.

Policy

Institutional policies on DMP, some examples (see also DCC website):

Edinburgh, point 3: http://www.ed.ac.uk/schools-departments/information-services/about/policies-and-regulations/research-data-policy

Lincoln (draft), point 4: https://github.com/lncd/RDM-Policy/blob/master/Lincoln%20RDM%20Policy.md

Warwick, point 7: http://www2.warwick.ac.uk/services/rss/researchgovernance_ethics/research_code_of_practice/datacollection_retention/reseatch_data_mgt_policy

Funder policies on DMP:

Various requirements at application and funded project stages. For example:

ESRC: http://www.esds.ac.uk/create/esrc/dataman/ and http://ukdaresearchdatamanagement.blogspot.co.uk/

NERC: http://www.nerc.ac.uk/research/sites/data/dmp.asp?cookieConsent=A

MRC: http://www.mrc.ac.uk/Ourresearch/Ethicsresearchguidance/datasharing/DMPs/index.htm

See also DCC mappings across 6 funder policies to generic DCC Checklist (July 2011)

Training and guidance on research data management planning

External institutional support pages guidance:

http://www.admin.ox.ac.uk/rdm/dmp/plans/

http://www.ed.ac.uk/schools-departments/information-services/services/research-support/data-library/research-data-mgmt/data-mgmt/why-research-data-policy

http://www.gla.ac.uk/services/datamanagement/creatingyourdata/dataplanning/

MANTRA training package covers DMP.

Advocacy for why DMP is important:

“… the role of data management for a new researcher as being one of those essential skills that you really ought to get at the same time as you learn how to handle your references, as you understand methodology, as you get to grips with the theory that is going to set the frame by which you do your research. And it sits alongside those and it’s equal to them …” – Professor Jeff Haywood, Vice Principal, CIO & Librarian, University of Edinburgh talks about the role of data management for PhD students and early career researchers

“… it actually gives you a really good framework and for my postgrads now I am pointing them towards that and saying hey,  you know, take a look at that because it will help you to think about how you’re going to gather your data and how you are going to look after it from  the beginning to the end of the project. It gives you a framework to deal with it rather than realizing too late that you  haven’t done some things that you should have done and  therefore you’ve made your life and perhaps actually cause problems for you with the use of data subsequently or sharing your data is made that more difficult.” – Professor Jeff Haywood, Vice Principal, CIO & Librarian, University of Edinburgh talks about the role of data management for  PhD students and early career researchers

Attribution: EDINA and Data Library, University of Edinburgh. Research Data MANTRA [online course]. http://datalib.edina.ac.uk/mantra

Also available as a video.

DCC resources:

Discussion on DMP approaches and reviewing the styles of questions,  format, how ‘active’ in approach

The Oxford DMPOnline Project wrote on and discussed the detail of research data management plans, which makes very interesting reading. They discussed the concept of ‘plan questions’, ‘project questions’ and ‘data questions’, and the common issues they found when reviewing DMPs, such as compound questions, duplicates and questions unique to individual plans [link to table XLS]. On DMP style they noted discursive versus concise approaches, ‘metadata’ versus ‘data’ questions, the option to add possible responses, overall gaps in DMP scope, and plans that lead to quantified expected data sizes/acquisition rates (resulting in actionable identification of requirements that can be reported to a central service provider as a result of the plan).

The conclusions are worth reading:

“.. difficult work, since there are many possible questions ..”

“.. avoid asking ambiguous questions ..”

“.. avoid asking for the same or similar information multiple times..”

“..unique questions not covered by the DMPonline ..”

“.. all of the available question sets have drawbacks ..”

“.. in terms of comprehensiveness, the best may be the enemy of the good enough..”

“.. devise and standardize the best possible set of questions for different constituencies of user ..”

[and more …]

http://datamanagementplanning.wordpress.com/2012/03/27/dmp-questions-comparisons-and-conclusions/

DMPs for different audiences – from targeted plans to template author background ‘bias’/priorities

Life-cycle stage specific: from (conception?), pre-award, post-award, to post-project.

Postgrad research project versus PI bidding for new funding.

Curator/archiver versus researcher orientated.

DMP online authoring tools or offline Word/PDF templates

Online tool has many useful advanced staging, customisation & collaboration features.

Online systems:

DMPOnline – pre-award, post-award, post-project, templates for Research Council/major funders, default templates, post-grad, etc. Features – add additional questions from DCC checklist, save, share/collaborate, copy, export to Office files, etc.

DCC/DMPOnline:

  • Introduction and Context
  • Data Types, Formats, Standards and Capture Methods
  • Ethics and Intellectual Property
  • Access, Data Sharing and Re-Use
  • Short-Term Storage and Data Management
  • Deposit and Long-Term Preservation
  • Resourcing
  • Adherence and Review

DCC website/DMPOnline

DMPOnline tools training:

http://www.dcc.ac.uk/webfm_send/879
http://www.dcc.ac.uk/webfm_send/881
http://www.screenr.com/Syo

DMPOnline advocated by MRC (ref 14), etc.

Institutional customisation or tailoring for local use available.
GitHub code: https://github.com/DigitalCurationCentre/DMPOnline (Ruby on Rails/MySQL)

Offline templates (Word/PDF format):

Some users do not like online systems, are overwhelmed by the array of features/customisation options and just want a ready-to-go, familiar Office document to type into.

DATUM: http://www.northumbria.ac.uk/sd/academic/ceis/re/isrc/themes/rmarea/datum/action/outputs/?view=Standard

Shotton 20 Questions http://datamanagementplanning.wordpress.com/2012/03/07/twenty-questions-for-research-data-management/ [CC:BY 3.0]

Bath 360 (postgrad-specific): http://blogs.bath.ac.uk/research360/2012/03/postgraduate-dmp-template-first-draft/

DMTpsych (York) (postgrad?): http://www.dmtpsych.york.ac.uk/docs/pdf/dmpt_guidance.pdf

MRC: http://www.mrc.ac.uk/Utilities/Documentrecord/index.htm?d=MRC008617

Wellcome Trust: http://www.wellcome.ac.uk/About-us/Policy/Spotlight-issues/Data-sharing/Guidance-for-researchers/index.htm

Wider DMP discussions

  • extent of pre-population of template with default institutional information to aid researcher versus reducing actual thinking/planning for RDM
  • experience in information banks of DMPs, shared pool, ‘successful’ DMP
  • DMP online tools – metadata transfer protocols between systems/integration in existing RIM systems
  • DMP training needs, online/in person, embedding with training – ‘dual service engagement’ (e.g. Monash), see DCC ‘support researchers with DMP’
  • DMP embedding in existing institutional processes, internal peer review, funder review, DMP fields (e.g. data size) resulting in a RIM system flag or automatic central service trigger
  • time requirements for writing a plan – minimal plans/resources required to support/advise/review DMP
  • DMP auditing – institutional, Funding Council, etc.
  • wider use as knowledge/information base for forward institutional planning, storing DMP (or parts of it) with an archived data set, re-use to support metadata population
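
To make the idea of a DMP field raising a flag in another system a little more concrete, here is a minimal sketch. It is purely illustrative: the record fields, the 500 GB threshold and the function names are assumptions made for the example and do not describe DMPOnline or any existing RIM system.

```python
from dataclasses import dataclass

# Hypothetical threshold above which a central storage service is notified.
STORAGE_FLAG_THRESHOLD_GB = 500.0


@dataclass
class DmpRecord:
    """A few quantified fields that a data management plan might capture."""
    project: str
    expected_size_gb: float          # total data volume stated in the plan
    acquisition_gb_per_month: float  # stated rate of data acquisition
    duration_months: int


def projected_size_gb(dmp: DmpRecord) -> float:
    """Take the larger of the stated total and rate multiplied by duration."""
    from_rate = dmp.acquisition_gb_per_month * dmp.duration_months
    return max(dmp.expected_size_gb, from_rate)


def needs_storage_flag(dmp: DmpRecord,
                       threshold_gb: float = STORAGE_FLAG_THRESHOLD_GB) -> bool:
    """Return True when the plan should trigger a central service review."""
    return projected_size_gb(dmp) >= threshold_gb


if __name__ == "__main__":
    plan = DmpRecord("imaging-study", expected_size_gb=100.0,
                     acquisition_gb_per_month=20.0, duration_months=36)
    print(projected_size_gb(plan), needs_storage_flag(plan))
```

A check like this only works if the plan asks for quantified answers, which is one argument for including data size and acquisition rate questions in a template.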

JISC Research Data Management Planning Projects

Strand B: DATUM, DMPSPsych, History DMP, etc.

Next steps for iridium project:

Reporting on initial user testing of DMPOnline and other templates, authoring a local DMP template and hosting options.

iridium – fourth postgraduate student feedback on MANTRA RDM (4)

This is the fourth in a series of postgraduate blog posts reviewing the MANTRA training package from different discipline perspectives.

This post is from Blanca, postgraduate student in flood management:

“I have recently joined the iridium postgraduate team. My first task was to complete the online course by Data Library and EDINA, University of Edinburgh.

The online course comprises several learning units and software practicals. I was only required to follow the online learning units; some units are still in development and I am looking forward to exploring them in the near future.

The units I had the opportunity to explore are the following:

Unit 1 – Research data explained: really useful if you want to know the difference between what is research data and what is not. At the beginning I thought everything was research data!

Unit 2 – Data management plans: a data management plan never ends! That’s what I took from this chapter. It also reminded me that there is a public responsibility that one takes when doing research, thus it is important to correctly and responsibly handle research data.

Unit 3 – Organizing data: perhaps my favorite chapter. It helped me design a naming convention and versioning methodology for my files, something I will continue doing for the rest of my life.

Unit 4 – File formats and transformation: quite informative, especially the concept of file normalization and its importance when sharing data, as it makes data more flexible and comparable.

Unit 5 – Documentation and metadata: it was really useful to know the difference between documentation and metadata. However, it was difficult for me to understand metadata categories.

Unit 6 – Storage and security: This was more like a horror movie, “The nightmare of PhD students”. It really made me realize the price of losing my research data. I haven’t generated huge amounts of data, but the truth is that I need to review my backup strategy. This chapter has really good videos with extremely important messages: “Backup is the most important part of RDM”.

Overall, I found this online course quite useful. I would totally recommend it! Besides providing me with the background and the jargon to carry out my work within iridium, it has actually helped me with my own project management. I believe it would be a great idea to introduce this kind of course to all postgraduate students at Newcastle University as part of the postgraduate research development program. This kind of course would be a great complement to the workshops that we have already. It also provides great skills for our future professional life.”
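
For readers wondering what a naming convention and versioning scheme of the kind mentioned under Unit 3 might look like in practice, the following is one possible sketch. The pattern `project_description_YYYYMMDD_vNN.ext` and the helper names are assumptions for illustration only; they are not the convention Blanca adopted, nor one prescribed by MANTRA.

```python
import re
from datetime import date


def build_name(project: str, description: str, day: date,
               version: int, ext: str) -> str:
    """Build a file name of the form project_description_YYYYMMDD_vNN.ext."""
    # Lower-case the description and collapse anything non-alphanumeric to "-".
    slug = re.sub(r"[^a-z0-9]+", "-", description.lower()).strip("-")
    return f"{project}_{slug}_{day:%Y%m%d}_v{version:02d}.{ext}"


def next_version(filename: str) -> str:
    """Return the same file name with its _vNN version number increased by one."""
    match = re.search(r"_v(\d+)(\.[^.]+)$", filename)
    if match is None:
        raise ValueError(f"no version suffix found in {filename!r}")
    width = len(match.group(1))
    number = int(match.group(1)) + 1
    return f"{filename[:match.start()]}_v{number:0{width}d}{match.group(2)}"


if __name__ == "__main__":
    first = build_name("flood", "Gauge readings (raw)", date(2012, 5, 14), 1, "csv")
    print(first)                # flood_gauge-readings-raw_20120514_v01.csv
    print(next_version(first))  # flood_gauge-readings-raw_20120514_v02.csv
```

Putting the date in year-month-day order means files sort chronologically in an ordinary directory listing, and a zero-padded version number keeps v02 ahead of v10.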

iridium – third postgraduate student feedback blog on MANTRA RDM training

Continuing the series of posts on the MANTRA RDM online training tool from postgrads in different disciplines.

The third post is from Jack, postgraduate student in Philosophy.

“I recently read through MANTRA, the online research data management training guide from Edinburgh University, as a novice beginning work on the Iridium project. By novice I mean my own current research is broadly in Philosophy at masters level. As such, with regards to my own area of study, research data management took little priority beyond maintaining a bibliography of secondary texts I had read for reference. My hope for the training was to widen the context of research data across the spectrum of methods and data types.

The first section proper (“Research data explained”) I found to be a useful introduction to the remit of what may constitute research data and how it may be generated, with practical examples for each. The next section (“Data Management Plans”) emphasises the importance of having a management plan, placing it in the context of one’s own research by asking what best suits the type of data the researcher uses. It then breaks down the general components of a data management plan with a checklist of what should go into each:

(c) EDINA and Data Library, University of Edinburgh. Research Data MANTRA [online course], http://datalib.edina.ac.uk/mantra CC:BY

With “Organising Data”, the need for good file management will be familiar to researchers with lax conventions for saving. With the best will in the world, whilst one may be confident at the time of remembering where data was saved and under what title, a few weeks or months down the line this becomes a cause of frustration. Intelligible and simple ways of saving are provided to avoid this. Reference is made to bulk renaming tools to help apply conventions across large numbers of files. The RDM Blankety Blank-style summaries at the end of the sections are a good way to check what information has been absorbed:
 (c) EDINA and Data Library, University of Edinburgh. Research Data MANTRA [online course], http://datalib.edina.ac.uk/mantra CC:BY


I found the video in the “Documentation and Metadata” unit beneficial, as it relates directly to recording metadata in the social sciences. The student outlines the issues involved and also highlights how recording metadata can be important for one’s own sake, as a reminder of what methods were used and for what reasons when returning to projects later on. For an introduction, I found some of the speakers in the other videos perhaps too in-depth, though the enthusiasm is creditable.

The final unit, “Storage and Security”, is by far the largest section and could act as a standalone training module. It goes into detail on the importance of regular backups and appropriate storage. Whilst most researchers will be familiar with the pains of losing data they have worked on, a few first-hand horror stories from those who have lost data are given to affirm the importance of this. The concluding “Recommended Resources” section provides recent documents and webpages for further introductory guides to research data and management plans, which open embedded in the training page. For anybody looking to go on to create a management plan, I found the Sarah Jones guide from the Digital Curation Centre to be detailed yet straightforward.

To summarise, with my own work being broadly based in the humanities, understanding what constitutes research data seemed less clear-cut than in other faculties. Whereas with chemistry or biology the raw data is readily demarcated, as that which is studied in a laboratory, I initially found it less discernible with more academic essay writing. However, the training asks you to question your own data and covers a great range of data types, which gave me a better understanding of research data both in relation to my own work and more generally. I have found splitting the different areas of RDM into separate units beneficial. Following interviews conducted for Iridium I have since gone back to individual sections to go through the area again and clarify any doubts.”

iridium – second postgraduate student feedback on MANTRA RDM training (2)

As previously reported, the postgraduate support team completed the MANTRA RDM online training tool a while back. Here is the second in a series of postgraduate blog posts reviewing the training package from different discipline perspectives.

The second post is from Sathish, postgraduate student in signal processing.

“I recently had a chance to complete MANTRA, the online resource for Research Data Management (RDM) training. It helped me understand the basics of RDM. As part of the Iridium support team I carried out interviews with researchers and staff across various disciplines of the university to understand the problems faced with data management. MANTRA helped a lot during this process.

The first module of MANTRA, ‘Research Data Explained’, gave basic information on data and on different types of data. Initially, research data and research records meant the same thing to me, so it was good to learn the difference between them.

The most informative module was ‘Data Management Plans’. I believe that every researcher should make an effective data management plan at the beginning of their research. I would strongly recommend this section to researchers who are just starting out, as it contains very useful information on what management plans should contain and how they should be made. It is easy to get lost in research; a plan helps you keep track and get back on track when lost. Being in the second year of my research, I feel that I should have taken this module at the beginning.

‘Organizing Data’ looks like the basic matter of naming files and folders, but I would caution researchers that unorganised research data will drive you mad. During the research process we tend to do many things (useful and not so useful); when everything is clearly organised, file retrieval becomes very simple. Moreover, it will be very easy to refer back to the files even many years after the research.

‘File Formats and Transformation’: initially, the material gave a very good idea of what open and closed file formats are, followed by the significance of text file formats. The section on the potential risks of changing file formats was worthwhile and is especially significant to me now, as I am thinking of changing the file formats of the research data I hold. I am going to draw up a list of potential problems that can occur while changing file formats so it can be used now and at any time later. The training led me through different file formats and transformations. Although I knew about the zip compression technique, ‘.tar’ format compression was very new to me.

The importance of data documentation is well presented. I liked the video of the PhD student; her explanation of the importance of documentation/metadata was very straightforward. I understood the crucial importance of metadata when she said, ‘Documenting results are not just enough but it is also very important to document the why the experiment was carried out.’

The significance of backups and the consequences of not keeping data secure are well explained. There were a lot of very good videos showing the significance of data security and ways to keep data secure (encryption techniques).

As a whole, all the modules were very useful and interactive. Before starting the resource I had no clue what RDM was. I initially believed that the training would only be useful for the purposes of Iridium, but now I understand how useful it will be for researchers. The summary pages at the end of each module were really useful; they helped me recollect the material and could be used in the future as a quick reference.

MANTRA training will be very valuable for everyone in research, and what I feel after all the time spent with MANTRA is that I should have gone through this material at the start of my PhD, although at least I have had the chance now.”
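
As a small practical footnote to the points above on ‘.tar’ files and backups, the sketch below bundles a folder into a gzip-compressed tar archive and records a SHA-256 checksum so that a later copy can be verified. It is an illustration only, using Python’s standard `tarfile` and `hashlib` modules; the function names and folder layout are assumptions and do not come from the MANTRA material. Note that tar on its own only bundles files together; here the compression comes from gzip.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Return the SHA-256 checksum of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def archive_folder(folder: Path, archive: Path) -> str:
    """Write the folder to a .tar.gz archive and return the archive checksum."""
    with tarfile.open(str(archive), "w:gz") as tar:
        tar.add(str(folder), arcname=folder.name)
    return sha256_of(archive)


def archive_members(archive: Path) -> list:
    """List the names stored in the archive, sorted for easy comparison."""
    with tarfile.open(str(archive), "r:gz") as tar:
        return sorted(tar.getnames())


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        data = Path(tmp) / "data"
        data.mkdir()
        (data / "readings.csv").write_text("time,value\n0,1.5\n")
        backup = Path(tmp) / "data.tar.gz"
        print(archive_folder(data, backup))
        print(archive_members(backup))
```

Storing the checksum alongside the archive lets you confirm, months later, that a backup copy is still identical to the one originally written.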

The development of MANTRA RDM training package has recently been blogged about on the Jorum website.
