
Getting Started

Please use the following outline to get started with a new data science project. If you run into any problems or need additional support, please join our Slack and post a message in the #support channel, or open an issue here.

Define a Project Document

  1. Review our project workflow document here.
  2. Create a new project description using our template. For reference, you can review an example project document here.
  3. Put your project document into a subfolder of this shared directory.

Create Git Repository from a Template

Next, create a Cookiecutter-formatted data science project repository from this template. To do so, please follow these instructions for creating a new repository from a template.

Set Up AICoE Continuous Integration (CI)

AICoE-CI can aid various aspects of a development workflow such as running pre-commit checks, build checks, triggering pipelines to build images, etc.

For more information on AICoE-CI and how to set it up for your project, you can follow the documentation here.

Create Project Boards and Issues

You can create issues to highlight new features and bugs, and to break a project's upcoming goals into smaller chunks. Create GitHub Issues and include task lists for each task that you plan to work on.

Here is an example Issue that we created to highlight a certain task for our project.

We use project boards to organize the tasks within the team.

Once your project board is created, begin to create issues and populate the board with them:

  • Create an issue for every task you plan to work on

    • Break issues down into smaller pieces if they cannot be completed in a single sprint
    • Be specific in the description of the issue
    • Tag other related issues with a ‘#’ as necessary
    • Provide acceptance criteria for when the issue can be closed
    • Assign the issue to yourself and to any teammates you plan to work with on the task
  • Add the issues to the New column of your project board
  • When you’re almost ready to begin working on an issue, move it to the To Do column
  • Once you start work on an issue, move it to the In Progress column
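
As a rough illustration of the checklist above, here is a minimal Python sketch that assembles an issue body with a description, a task list, acceptance criteria, and `#` references to related issues. The `IssueDraft` class and `render_issue_body` function are hypothetical names made up for this example and are not part of any project tooling. The only GitHub conventions assumed are the `- [ ]` task list syntax and `#<number>` issue references.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class IssueDraft:
    """Hypothetical container for the parts of an issue listed above."""

    title: str  # goes in the issue's title field, not in the body
    description: str
    tasks: List[str] = field(default_factory=list)
    acceptance_criteria: List[str] = field(default_factory=list)
    related_issues: List[int] = field(default_factory=list)


def render_issue_body(draft: IssueDraft) -> str:
    """Render the draft as Markdown text suitable for an issue body."""
    lines = [draft.description, ""]
    if draft.tasks:
        lines.append("## Tasks")
        # "- [ ]" renders as an unchecked task list item on GitHub.
        lines.extend(f"- [ ] {task}" for task in draft.tasks)
        lines.append("")
    if draft.acceptance_criteria:
        lines.append("## Acceptance criteria")
        lines.extend(f"- {item}" for item in draft.acceptance_criteria)
        lines.append("")
    if draft.related_issues:
        # "#<number>" links to another issue in the same repository.
        refs = ", ".join(f"#{number}" for number in draft.related_issues)
        lines.append(f"Related: {refs}")
    return "\n".join(lines).strip() + "\n"


# Example values below are invented for illustration only.
draft = IssueDraft(
    title="Explore the dataset",
    description="Initial exploratory analysis of the raw data.",
    tasks=["Load the data in a notebook", "Plot basic distributions"],
    acceptance_criteria=["Notebook is merged and runs end to end"],
    related_issues=[12],
)
print(render_issue_body(draft))
```

The printed text can be pasted into the description field of a new issue. Keeping the acceptance criteria in their own section makes it easier to decide when the issue can be closed.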

For more information on the remainder of our GitHub workflow, please visit this page.

Develop on Operate First JupyterHub

The following resources are available for use to develop your project on the Operate First JupyterHub instance:

  • Start developing on the Operate First JupyterHub environment

    • View the video tutorial
    • Access the public JupyterHub instance on the Massachusetts Open Cloud (MOC) by selecting operate-first authentication and then signing in with your GitHub account
  • How to Monitor JupyterHub workloads using Grafana dashboards

    • (Video tutorial forthcoming)
    • To troubleshoot issues while running your notebooks, you can refer to the Troubleshooting Runbooks
    • You can refer to the existing Grafana dashboard to monitor your JupyterHub workloads, including CPU usage, RAM usage, and percentage PVC usage.
  • How to Meteorize the repository

    • To meteorize your GitHub repository, start by making a Jupyter Book. You can refer to the tutorial here.
    • After you have the Jupyter Book ready, head to Meteor to meteorize the project repository.

Managing Dependencies

Guidance for managing notebook dependencies can be found in this video. The repository for our dependency management tool can be found here.

Deliverables

Here are potential ways to contribute and showcase work you have completed.

  • Slide decks
  • Blog posts (example)
  • Videos (example)
  • Tutorials (example)
  • Content images
  • Project Repositories (example)
  • Meteorized Repositories (example)
  • Dashboards
  • Model Services

Creating JupyterHub Images

JupyterHub images can be built from your projects to easily share reproducible ML experiments and create interactive data science working environments. Please refer to the following guide for more information.

Creating Automated Model Workflows

To automate your notebook workflows using Elyra and Kubeflow Pipelines on Open Data Hub, you can refer to the guide and view our video tutorial.