Coder

From LMU BioDB 2015
Gene Database Project Links
Overview Deliverables Reference Format Guilds Project Manager GenMAPP User Quality Assurance Coder
Teams Heavy Metal HaterZ The Class Whoopers GÉNialOMICS Oregon Trail Survivors

The coder is the resident expert on the technology being used: assorted software, file management, version control, some troubleshooting, and some programming. He or she coordinates with Drs. Dahlquist and Dionisio in extending GenMAPP Builder code and making new versions. GenMAPP Builder is written in Java and is built on open source pure-Java libraries. Source code is hosted on GitHub and built using Apache’s ant utility.

Guild Members

Milestones

Milestone 0: Working Environment Setup

Because the machines in the Seaver 120 computer lab have already been set up for this process, the information below is listed primarily for documentation and troubleshooting purposes.

Milestone 1: Version Control Setup

  1. Get a GitHub account and pass it to Dr. Dionisio so that you can be added as a developer of the XMLPipeDB project on GitHub.
    • Once you are set up as a developer, you can clone and push your GenMAPP Builder source code.
  2. Create a GitHub branch of xmlpipedb for your team.
  3. (with QA) Commit and push relevant source data to the GenMAPP Gene Databases folder of your GitHub branch.
    • You can always verify what is publicly visible on your branch by visiting the XMLPipeDB GitHub website, choosing your branch from the Branch dropdown menu, then inspecting the code that is visible there.

Milestone 2: “Developer Rig” Setup and Initial As-Is Build

  1. Install core software for developing, building, and testing prototype versions of GenMAPP Builder:
    • Java developer tools: JDK 8 (which, at this writing, is JDK 8u65)
    • A git client (for interacting with GitHub)
    • Any tool that can unpack .gz and .zip files (we are using 7-zip on the Seaver 120 machines)
    • XMLPipeDB Match utility
    • Development environment: while any will do, Eclipse is the specific one that most XMLPipeDB developers have used:
      • Download and install Eclipse from its download web site. Either Eclipse IDE for Java Developers or Eclipse IDE for Java EE Developers will work.
      • Eclipse includes ant, so you do not need a separate ant installation unless you plan to build GenMAPP Builder outside of Eclipse.
      • If you want to use ant outside Eclipse, please visit http://ant.apache.org.
  2. Follow the instructions in the GenMAPP Builder Project Setup and Initial Build section of this wiki page in order to:
    • Set up a functioning Eclipse development environment for your branch of GenMAPP Builder.
    • Build your own copy of GenMAPP Builder from scratch.
  3. (with QA) Get a full import-export cycle done.
  4. (with QA) Decide on a file/version management scheme/system.

As needed, coders may arrange for a walkthrough or other help session with Dr. Dionisio if there are any issues with the procedures on this guild page.

Milestone 3: Species Profile Creation

Follow the instructions in the Adding a Species Profile to GenMAPP Builder section of this wiki page in order to:

  • Add a species profile to the GenMAPP Builder code base.
  • Customize the species profile with the species name in the OrderedLocusNames record of the Systems table.
  • Customize the Link field in the OrderedLocusNames record of the Systems table to hold a URL query with ~ standing in for the gene ID.
    • (with QA) The URL would need to be determined first, of course.

Milestone 4: Species Export Customization

  1. Based on observations from the GenMAPP User and QA, determine and document (as thoroughly as possible) any other modified export behavior that GenMAPP Builder will have to manifest for this species.
  2. Implement this export behavior.
  3. As needed, commit and push your work to your GitHub branch.
  4. Additional milestones will depend on how the rest of the project goes, and the bugs/features generated by that work.
  5. Document/log all work done, problems encountered, and how they were resolved.
  6. When your work is complete, issue a GitHub pull request to merge your branch into the main development line.

GenMAPP Builder Project Setup and Initial Build

This section of the page seeks to provide a guide for building new versions of GenMAPP Builder. You can only run GenMAPP and MAPPFinder on Windows, but you can build and run GenMAPP Builder on any platform that supports PostgreSQL and JDK 8.

Although there are many ways to update and maintain GenMAPP Builder code, for uniformity these instructions will assume the use of Eclipse for viewing, modifying, and updating GenMAPP Builder. The main benefit of Eclipse is that it is largely a one-stop shop for performing all of these tasks.

The instructions listed in this Setup section need only be performed once. Once done correctly, you will primarily be doing what is described in the Common Tasks section.

Prerequisites

  1. Make sure that you have already accomplished the version control setup milestone (Milestone 1).
  2. Make sure that you have already downloaded and installed the software mentioned in Milestone 2 (first item).

GitHub Repository Clone Setup

  1. Determine the desired location (on your development computer) for your local copy of the XMLPipeDB GitHub repository.
  2. cd to this location.
  3. Clone the repository:
    git clone https://github.com/lmu-bioinformatics/xmlpipedb.git
  4. cd into the clone folder xmlpipedb:
    cd xmlpipedb
  5. Switch to your branch:
    git checkout <your-branch-name>
  6. You should see a message like this:
    Branch <your-branch-name> set up to track remote branch <your-branch-name> from origin.
    Switched to a new branch '<your-branch-name>'
    …this is OK and expected.

Eclipse Workspace Setup

[Screenshot: Initial-eclipse-workspace.png]
  1. Run Eclipse.
  2. When Eclipse asks you to select a workspace, click the Browse… button and navigate to your repository clone folder. Select it and click Open.
  3. Verify that the Workspace: field lists your repository clone folder, then click OK.
  4. You will see an introductory display with assorted menu items. Click the Workbench button (upper-right corner, with a 3D arrow for an icon).
  5. If everything went well, you should now see an empty developer area, showing your repository clone folder in the window title with a Project Explorer tab on the left (see screenshot for an example).

Java Project Setup

  1. Right-click anywhere within the empty Project Explorer tab and choose New > Project… from the menu that appears.
  2. Choose Java Project from the list of “wizards” and click on the Next > button.
  3. On the next panel, enter gmbuilder in the Project name: field. Note that the Location: field underneath, although disabled, should show your cloned repository folder location with /gmbuilder appended to it.
  4. Verify that the JRE section is showing Java 8 or 1.8. If not, talk to Dr. Dionisio.
  5. Click on the Finish button (yes, there is no need for further configuration).
  6. If Eclipse asks you whether you want to open the “Java perspective,” respond with Yes. You may check the Remember my decision checkbox if you wish.
  7. You should now see a gmbuilder project folder in the Project Explorer tab.
  8. If the gmbuilder project folder icon shows a red x icon, do the following:
    • Right-click on the gmbuilder item and choose Properties from the menu that appears.
    • Click on the Java Compiler item. Click on the item name itself, not its triangle.
    • You should see that Compiler compliance level: is set to 1.6. If so, uncheck the Enable project specific settings checkbox.
    • Click the Apply button. If Eclipse asks you whether it is OK to rebuild the project, click Yes.
    • Click on OK to dismiss the Properties dialog.
    • See if the red x icon has disappeared.
  9. If the red x icon persists, please show your setup to fellow guild members or Dr. Dionisio for troubleshooting.

Initial Build

  1. Open the gmbuilder project by clicking on the gray triangle to the left of its name.
  2. Within the gmbuilder Java project is a file called build.xml. It should have an icon that appears to include an ant.
  3. Right-click on build.xml and choose Run As > Ant Build... (the one with the ellipses) from the menu that appears.
  4. In the Edit Configuration dialog that appears, uncheck dist if it is already checked, so that the targets can be re-selected in the correct order.
  5. In the Targets tab, check the clean item first and then the dist item. The Target execution order section near the bottom of the dialog should say clean, dist.
  6. Click the Run button. The computer will work for a bit. You will see some messages scroll up on the Console tab if that tab is visible.
  7. When it is done, right-click on the gmbuilder project folder and choose Refresh from the menu that appears (F5 is its keyboard shortcut).
  8. You should see a dist folder appear inside the gmbuilder project folder.
  9. This is your personally-built copy of GenMAPP Builder. Its contents correspond to the extracted contents of the gmbuilder-3.0.0-build-5.zip file that was downloaded in class.

Adding a Species Profile to GenMAPP Builder

All of this work happens in the Java perspective, so switch to that first if you’re not already there.

Create the Species Profile

  1. Expose the contents of the src folder.
  2. Right-click on the edu.lmu.xmlpipedb.gmbuilder.databasetoolkit.profiles package and choose New > Class from the popup menu.
  3. In the dialog that appears, enter the following:
    • Name: name-of-your-species-without-spacesUniProtSpeciesProfile (in camel case: no spaces, capitalizing the first letters of each word)
    • Superclass: edu.lmu.xmlpipedb.gmbuilder.databasetoolkit.profiles.UniProtSpeciesProfile (you can also click on Browse... to navigate to this if you don’t feel like typing)
  4. Click Finish. There should now be a new .java file (the class you just created) within the edu.lmu.xmlpipedb.gmbuilder.databasetoolkit.profiles package.
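
As a rough illustration of the naming convention in step 3, the sketch below derives a class name from a species name. The SpeciesProfileNames class and its profileClassName method are hypothetical helpers written for this page; they are not part of XMLPipeDB, and the class name is normally just typed by hand.

```java
import java.util.Locale;

/**
 * Hypothetical helper (not part of XMLPipeDB): turns a species name such as
 * "Vibrio cholerae" into a camel-case class name ending in
 * UniProtSpeciesProfile, following the convention described in step 3.
 */
final class SpeciesProfileNames {
    private static final String SUFFIX = "UniProtSpeciesProfile";

    static String profileClassName(String speciesName) {
        StringBuilder name = new StringBuilder();
        for (String rawWord : speciesName.trim().split("\\s+")) {
            // Drop characters that are not legal in a simple Java identifier.
            String word = rawWord.replaceAll("[^A-Za-z0-9]", "");
            if (word.isEmpty()) {
                continue;
            }
            // Capitalize the first letter of each word, lowercase the rest.
            name.append(Character.toUpperCase(word.charAt(0)));
            name.append(word.substring(1).toLowerCase(Locale.ROOT));
        }
        return name.append(SUFFIX).toString();
    }

    public static void main(String[] args) {
        System.out.println(profileClassName("Vibrio cholerae"));
    }
}
```

Existing profiles in the code base may differ slightly in capitalization, so treat the result as a suggestion rather than a rule.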

Customize the Species Profile

  • Open the file that you have just created. It should appear in the editor area of Eclipse.
  • Supply the name of the species and the description of the profile: add the following constructor block right below the public class line in the new file. Remember to customize according to your particular species; the portions that need to be customized are marked with asterisks.
public ***NameOfYourSpecies***UniProtSpeciesProfile() {
    super("***Genus species***",
        ***taxonIDOfYourSpecies***,
        "This profile customizes the GenMAPP Builder export for " +
            "***Genus species***" +
            " data loaded from a UniProt XML file.");
}
  • To customize the species profile with the species name in the OrderedLocusNames record of the Systems table as well as a link query for that same record, add the following method block right below the constructor block that you added above. Again, the key information to customize is marked with asterisks.
@Override
public TableManager getSystemsTableManagerCustomizations(TableManager tableManager, DatabaseProfile dbProfile) {
    super.getSystemsTableManagerCustomizations(tableManager, dbProfile);
    tableManager.submit("Systems", QueryType.update, new String[][] {
        { "SystemCode", "N" },
        { "Species", "|" + getSpeciesName() + "|" }
    });

    tableManager.submit("Systems", QueryType.update, new String[][] {
        { "SystemCode", "N" },
        { "Link", "***species-specific-database-link***" }
    });

    return tableManager;
}
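
To make the two customized values more concrete, here is a minimal, self-contained sketch of what they might look like for one species. It assumes only what is stated above: the Species value is the species name wrapped in | characters, and the Link value is a URL in which ~ stands in for the gene ID. The SystemsLinkPreview class, the example.org URL, and the sample gene ID are hypothetical; the real substitution is performed when the link is followed from the gene database, not by this code.

```java
/**
 * Hypothetical preview helper (not part of XMLPipeDB): shows the Species and
 * Link values for the OrderedLocusNames record, and what a Link template
 * would expand to for a sample gene ID.
 */
final class SystemsLinkPreview {
    /** The placeholder that stands in for the gene ID in the Link field. */
    static final String GENE_ID_PLACEHOLDER = "~";

    /** Mirrors the "|" + getSpeciesName() + "|" expression in the method above. */
    static String speciesField(String speciesName) {
        return "|" + speciesName + "|";
    }

    /** Replaces the ~ placeholder with a concrete gene ID. */
    static String expandLink(String linkTemplate, String geneId) {
        if (!linkTemplate.contains(GENE_ID_PLACEHOLDER)) {
            throw new IllegalArgumentException(
                "Link template has no " + GENE_ID_PLACEHOLDER + " placeholder: " + linkTemplate);
        }
        return linkTemplate.replace(GENE_ID_PLACEHOLDER, geneId);
    }

    public static void main(String[] args) {
        // Hypothetical URL; the real one is determined with QA.
        String template = "https://example.org/gene?id=~";
        System.out.println(speciesField("Vibrio cholerae"));
        System.out.println(expandLink(template, "VC0001")); // sample ID for illustration
    }
}
```

Expanding the template by hand like this, then pasting the result into a browser, is a quick way to confirm with QA that the URL resolves before it is committed to the profile. Note that this simple sketch would misbehave for a URL that contains a literal ~ elsewhere.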

Additional customization, particularly with regard to the exported data, will depend on the species. Communicate with your QA to see if additional customization is needed. If the additional customization is not too complicated, you might be able to do the work yourself with some instructions. However, if the customization is too difficult, Dr. Dionisio will probably be the one to do the work.

Add the Species Profile to the Catalog of Known Species Profiles

The last step involves actually making GenMAPP Builder know that your new species profile exists. This involves a change in an existing file:

  • Under edu.lmu.xmlpipedb.gmbuilder.databasetoolkit.profiles, open UniProtDatabaseProfile.java.
  • Near the top of the file is a block that looks like this:
super("org.uniprot.uniprot.Uniprot",
    "This profile defines the requirements "
        + "for any UniProt centric gene database.",
    new SpeciesProfile[] {
    new EscherichiaColiUniProtSpeciesProfile(),
    new ArabidopsisThalianaUniProtSpeciesProfile(),
    new PlasmodiumFalciparumUniProtSpeciesProfile(),
    new VibrioCholeraeUniprotSpeciesProfile() });
  • What you want to do is add the species profile that you just created to this block. If your species profile is called MySpecialUniProtSpeciesProfile, your modified code should look like this:
super("org.uniprot.uniprot.Uniprot",
    "This profile defines the requirements "
        + "for any UniProt centric gene database.",
    new SpeciesProfile[] {
    new EscherichiaColiUniProtSpeciesProfile(),
    new ArabidopsisThalianaUniProtSpeciesProfile(),
    new PlasmodiumFalciparumUniProtSpeciesProfile(),
    new VibrioCholeraeUniprotSpeciesProfile(),
    new MySpecialUniProtSpeciesProfile() });
  • Essentially, you need to add an item to the comma-separated list, beginning with new, followed by the species profile name, finally followed by ().
  • Save your changes, run Organize Imports (Source > Organize Imports) to resolve any red error markers, and try a test build!

Build, Test, and Possibly Commit

  1. Create a new distribution of GenMAPP Builder based on Creating a Distribution.
  2. Perform a new export run with this version of GenMAPP Builder (you can skip the import steps and use the same PostgreSQL database if it’s available).
  3. Check the Systems table in the resulting .gdb to see if it contains the custom information:
    • Open the .gdb in Microsoft Access, then open the Systems table.
    • Look for the record for OrderedLocusNames. Your species name should appear under the Species column and your link URL should appear under the Link column.
  4. If all goes well, commit your code as described in Updating and Committing Code. You have now officially contributed to the XMLPipeDB project :)

Common Tasks

The tasks in this section reflect the typical development cycle.

Updating and Committing Code

  1. Right-click on the gmbuilder project folder and choose Team > Synchronize Workspace from the menu that appears.
    • If Eclipse asks you whether it is OK to enter the Team Synchronizing perspective, respond with Yes. You may check the Remember my decision checkbox so that this does not happen again.
  2. You will be switched to the Team Synchronizing perspective.
  3. The presence of left-pointing, blue-arrowed files means that the server has new updates for you to download. Right-click on the gmbuilder project folder and choose Pull from the menu that appears. You can also click the Pull button from the toolbar of the Synchronize tab.
  4. It is good “developer etiquette” to build a new distribution from scratch when you’ve received updates prior to committing your own changes. Thus, after the update, return to the Java perspective and run a quick clean, dist cycle to make sure that there are no errors.
  5. If everything works out, go to Synchronize Workspace again. If there are new updates (in the tiny amount of time since you last updated!), test things again.
  6. Eventually, you will see a Synchronize tab with no incoming code. At this point, go ahead and commit the right-pointing gray-arrowed files by right-clicking on them (or their containing folder) and choosing Commit....
    • If you see files that you did not consciously edit being marked for commit, check with Dr. Dionisio first. These may be build products or local settings files, neither of which should be committed.
  7. Verify that the files you want to commit are checked in the ensuing Commit Changes dialog.
  8. Just like with the wiki, it is good developer etiquette to describe briefly the nature of the changes that you are committing.
  9. When you are ready, you may choose either Commit and Push or just Commit.
    • Commit will preserve a version of these files, but not yet update the GitHub website. This allows you to keep on working without others being affected by what you do (yet).
    • Commit and Push will both save a version of your files and send the latest changes to GitHub. Your changes will then be visible to others.
    • You may Push as a separate step; this is available either from the Team right-click menu or the toolbar of the Synchronize tab.
  10. Even if you have nothing to commit, it is still a good idea to invoke Team > Synchronize Workspace regularly so that you are kept up-to-date with regard to files that others may be committing.

Creating a Distribution

To create your own version of GenMAPP Builder based on the code you have in Eclipse (which may contain some new changes/customizations that you would like to test), follow these steps:

  1. Switch to Eclipse’s Java perspective.
  2. Edit the GenMAPPBuilder.java source code to identify the distribution that you are about to create by setting the VERSION string (located at approximately line 83) to a sufficiently descriptive value.
  3. Open the gmbuilder project by clicking on the gray triangle to the left of its name.
  4. Within the gmbuilder Java project is a file called build.xml. It should have an icon that appears to include an ant.
  5. Right-click on build.xml and choose Run As > Ant Build... (the one with the ellipses) from the menu that appears.
  6. In the Edit Configuration dialog that appears, uncheck dist if it is already checked, so that the targets can be re-selected in the correct order.
  7. In the Targets tab, check the clean item first and then the dist item. The Target execution order section near the bottom of the dialog should say clean, dist.
  8. Click the Run button. The computer will work for a bit. You will see some messages scroll up on the Console tab if that tab is visible.
  9. When it is done, right-click on the gmbuilder project folder and choose Refresh from the menu that appears (F5 is its keyboard shortcut).
  10. You should see a dist folder appear inside the gmbuilder project folder.
  11. This is your personally-built copy of GenMAPP Builder. Its contents correspond to the extracted contents of the gmbuilder-3.0.0-build-5.zip file that was downloaded in class.
  12. Run PostgreSQL (pgAdmin III) and start a database, then run this copy of GenMAPP Builder as you would the “released” copy. The program should behave just like the one that you downloaded and have been using.
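
For step 2 above, one possible way to arrive at a "sufficiently descriptive" VERSION value is to combine the base version, your branch name, and the build date. This is only a sketch of one such scheme, not an established project convention; the VersionLabel class, the describe method, and the branch name are hypothetical, and the resulting string would still be typed into GenMAPPBuilder.java by hand.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;

/**
 * Hypothetical helper (not part of XMLPipeDB): composes a descriptive version
 * label from a base version, a branch name, and a build date.
 */
final class VersionLabel {
    static String describe(String baseVersion, String branch, LocalDate buildDate) {
        // BASIC_ISO_DATE renders a date as yyyyMMdd, e.g. 20151201.
        return baseVersion + "-" + branch + "-"
            + buildDate.format(DateTimeFormatter.BASIC_ISO_DATE);
    }

    public static void main(String[] args) {
        // "my-team-branch" is a placeholder for your own branch name.
        System.out.println(describe("3.0.0-build-5", "my-team-branch", LocalDate.now()));
    }
}
```

A label of this kind makes it easier to tell, from the running program alone, which branch and which day a test build came from.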