Convert More Data

Introduction

Once you have made it through a full conversion workflow and published your data in ResearchSpace, what happens when you have more data to add? You have a few options, depending on how the new data differs from the data you already converted.

Conversion Options

Edit your Data in ResearchSpace

Now that your data is in ResearchSpace, your team has access to ResearchSpace’s data editing and data creation features as explained in the Setup in ResearchSpace step. If your new data is a relatively small number of new entities or connections between them, then editing directly in ResearchSpace is a good option.

warning

When you edit your data directly in ResearchSpace, it creates a new version that diverges from your original unconverted source data. If you want those changes to apply to versions of your data outside of ResearchSpace, you will need to apply the same changes there.

If you plan on continuing to make significant changes to your original data and want those changes to appear in ResearchSpace, reach out to LINCS to discuss options.

Rerun a Conversion Workflow on New Data

Use this option when you have a new batch of data that follows the same structure and contains the same relationships as the batch you originally converted.

For example, you might have new rows for the spreadsheet you converted originally.

Here is how each step will need to change and be repeated:

  • Export Data
    • Repeat the same process.
  • Clean Data
    • If you used a script, then run it on the new data.
    • If you made manual changes, you need to apply those to the new data. Note that if you used OpenRefine, you may be able to open the original project and export the change history from the Undo/Redo tab.
  • Reconcile Entities
    • Any entities that did not appear in the first batch need to be reconciled externally.
    • Any entities that appear in both the first batch and the new batch need to be reconciled against one another (i.e., use the same identifier for the same entity in both batches).
  • Develop Conceptual Mapping
    • Because the structure of your data has not changed, you can reuse the same conceptual mapping from your original Develop Conceptual Mapping step.
  • Implement Conceptual Mapping
    • The script or template you used to implement that conceptual mapping will need to be rerun.
    • If you used 3M, then you only need to replace the input file for your 3M mapping project with your new data and hit run in 3M.
  • Validate and Enhance
    • Again, either the script you used or the manual changes you made will need to be repeated.
info

For new data, you still need to reconcile against external sources, but now you also need to reconcile against your already converted data.
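
The two reconciliation cases above can be sketched in a few lines of Python. This is a minimal illustration, not LINCS tooling: it assumes entities can be matched on a case-insensitive, whitespace-normalized label, and the function names, the sample labels, and the `example.org` identifiers are all hypothetical.

```python
def normalize(label):
    """Collapse whitespace and case so trivially different labels match."""
    return " ".join(label.split()).casefold()


def split_for_reconciliation(known_ids, new_labels):
    """Separate new-batch labels into reused and still-unreconciled entities.

    known_ids: dict mapping label -> identifier from the first batch.
    new_labels: iterable of entity labels found in the new batch.
    """
    index = {normalize(label): uri for label, uri in known_ids.items()}
    reused = {}        # new-batch label -> identifier already in use
    unreconciled = []  # labels that still need external reconciliation
    for label in new_labels:
        key = normalize(label)
        if key in index:
            reused[label] = index[key]
        elif label not in unreconciled:
            unreconciled.append(label)
    return reused, unreconciled


# Hypothetical identifiers assigned during the first conversion.
first_batch = {
    "Margaret Atwood": "http://example.org/entity/1",
    "Toronto": "http://example.org/entity/2",
}
# Hypothetical entity labels extracted from the new batch.
new_batch = ["margaret  atwood", "Alice Munro", "Toronto", "Alice Munro"]

reused, unreconciled = split_for_reconciliation(first_batch, new_batch)
```

Here `reused` holds the entities that should keep their first-batch identifier, and `unreconciled` lists the entities that still need to be reconciled against external sources. Label matching alone will miss variant spellings and can conflate different entities that share a name, so the matches should be reviewed before they are used.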

Run a New Conversion Workflow on New Data

If you have new data that does not have the same starting structure as the data you originally converted, or if it contains many new relationships, then you will need to repeat the appropriate conversion workflow on the new data. You may be able to use an edited version of your original conversion workflow if there are similarities between the batches of data.

Publication Options

Your newly converted data can be combined with data you already have in ResearchSpace so that it appears as a single project and named graph in the LINCS triplestore.

Alternatively, if it covers new subject matter and is part of a different research project, it can be published as a separate project in ResearchSpace and be stored in a different named graph.
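
To make the difference between the two options concrete, the sketch below writes the same new triple as an N-Quads line, once tagged with the existing project's named graph and once with a separate one. It only illustrates what a named graph is; it does not describe how LINCS loads data into its triplestore, and the graph names, entity IRI, and class IRI are hypothetical `example.org` placeholders.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"


def to_nquads(triples, graph_iri):
    """Serialize (subject, predicate, object) IRI triples as N-Quads lines.

    Assumes every term is an IRI; literals and blank nodes are not handled.
    """
    return [f"<{s}> <{p}> <{o}> <{graph_iri}> ." for s, p, o in triples]


# Hypothetical graph names; real graph names are assigned at publication.
EXISTING_GRAPH = "http://example.org/graph/project-a"
SEPARATE_GRAPH = "http://example.org/graph/project-b"

# A hypothetical triple produced by converting the new data.
new_triples = [
    ("http://example.org/entity/3", RDF_TYPE, "http://example.org/class/Person"),
]

# Option 1: the new data joins the graph of the existing project.
combined = to_nquads(new_triples, EXISTING_GRAPH)

# Option 2: the new data is stored in its own graph as a separate project.
separate = to_nquads(new_triples, SEPARATE_GRAPH)
```

The triples themselves are identical in both cases; only the fourth term, the graph name, changes.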