Monday, December 21, 2020

DevOps | Automation | Migration :: Subversion (SVN) to GitHub Enterprise Migration with 500GB+ data and 50,000+ Commits.

 

Subversion (SVN) is a centralized version control system, whereas GitHub Enterprise is a hosting platform built on Git, a distributed version control system.

In one of my recent projects I led the team that migrated the code base of a telecom software application, more than 10 years old and over 500 GB in size, including its commit history, from legacy SVN to enterprise-grade GitHub. I also supported validation of the workflow with multi-project dependencies across 46+ components. Some of the features used are described briefly below:


SVN:


  1. SVN externals handled the multi-module project dependencies in Subversion.
  2. All artifacts of the application code base were stored directly on the Subversion server.


GitHub Enterprise:


  1. GitHub was used to version-control code, configuration and other human-readable files. Individual files were limited to 100 MB, the default GitHub file size limit.
  2. Binaries and artifacts were version-controlled using Git LFS (Large File Storage).
  3. Multi-project dependencies were handled with Git submodules.
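
To decide which files belong in Git LFS, it helps to scan a working copy for anything above the size limit before the first push. The following is only a minimal Python sketch of that idea, not the script we used. The function names and the exact byte threshold are my own assumptions for illustration.

```python
import os

# Assumption: the 100 MB limit from the list above, expressed in bytes.
MAX_BYTES = 100 * 1024 * 1024


def find_oversized_files(root, max_bytes=MAX_BYTES):
    """Return sorted (relative_path, size) pairs for files larger than max_bytes."""
    oversized = []
    for dirpath, dirnames, filenames in os.walk(root):
        # Skip Git's own metadata directory.
        dirnames[:] = [d for d in dirnames if d != ".git"]
        for name in filenames:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue
            size = os.path.getsize(path)
            if size > max_bytes:
                oversized.append((os.path.relpath(path, root), size))
    return sorted(oversized)


def lfs_patterns(paths):
    """Derive one '*.ext' pattern per file extension found in paths."""
    extensions = {os.path.splitext(p)[1] for p in paths}
    return sorted("*" + ext for ext in extensions if ext)
```

Each pattern returned by `lfs_patterns` can then be passed to `git lfs track`, for example `git lfs track "*.jar"`, so that matching files are stored in Git LFS instead of the regular repository.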


Bridging of SVN and GitHub:


  1. The git-svn utility was used to convert commits identified by SVN revision numbers into Git commits identified by SHA hashes.
  2. A shell script handled preparation of the authors file, import of the SVN data and upload to GitHub.
  3. GitHub API integration with Terraform was used to manage organizations, repositories, teams and branch protection.
  4. The "git svn rebase" operation can be used to migrate the delta changes that accumulate in SVN over a period of time.
  5. Jenkins automation server jobs were reconfigured to move from SVN to GitHub.
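
The authors file and import steps above can be sketched roughly as follows. This is a Python illustration rather than the shell script from the project. The helper names, the e-mail domain and the repository URL are hypothetical, and it assumes the SVN user names are taken from `svn log --quiet` output and that the repository uses the standard trunk/branches/tags layout.

```python
def extract_svn_users(svn_log_quiet_output):
    """Collect unique user names from `svn log --quiet` output.

    Revision lines look like: r123 | user | 2020-12-21 10:00:00 +0000 (...)
    """
    users = set()
    for line in svn_log_quiet_output.splitlines():
        parts = [p.strip() for p in line.split("|")]
        if len(parts) >= 3 and parts[0].startswith("r"):
            users.add(parts[1])
    return sorted(users)


def authors_file_lines(users, full_names, email_domain):
    """Build git-svn authors-file lines: 'svnuser = Full Name <email>'."""
    lines = []
    for user in users:
        # Fall back to the SVN user name when no full name is known.
        name = full_names.get(user, user)
        lines.append(f"{user} = {name} <{user}@{email_domain}>")
    return lines


def git_svn_clone_command(svn_url, target_dir, authors_path):
    """Return the argument list for an initial import with git-svn."""
    return [
        "git", "svn", "clone",
        "--stdlayout",  # assumes a trunk/branches/tags layout
        f"--authors-file={authors_path}",
        svn_url,
        target_dir,
    ]
```

The lines from `authors_file_lines` would be written to a file such as `authors.txt`, and the list from `git_svn_clone_command` could be passed to `subprocess.run(cmd, check=True)`. Once the initial import has finished, running `git svn rebase` inside the cloned directory picks up newer SVN revisions.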


Developer Validation:


  1. Sourcetree from Atlassian is the best-suited tool for working with GitHub, Git LFS and Git submodules as a one-stop solution.
  2. Sourcetree supports GitHub, GitHub Enterprise, GitLab and Bitbucket.

I hope this post will be useful for folks working in DevOps who manage software version control with SVN and have a migration to GitHub Enterprise in the pipeline.


Challenges:


  1. The migration from SVN to GitHub was not free of challenges; we had quite a few in managing the SVN index data generated by the git-svn utility.
  2. Commit histories ranged from 100 to 50,000+ commits, and migrating the releases of a specific telecom product took days.
  3. The git filter-branch utility was used to clear index data references matching false-positive files, which blocked the migration or the push to GitHub Enterprise repositories.
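
For readers who have not used git filter-branch, here is a hedged Python sketch of how such a history rewrite can be assembled. It assumes the blocking files are known by path and should be dropped from every commit; the exact invocation used in the project is not reproduced here, and the function name and example paths are hypothetical.

```python
import shlex


def filter_branch_command(paths):
    """Return the argument list for a git filter-branch run that removes
    the given paths from the index of every commit on all refs."""
    quoted = " ".join(shlex.quote(p) for p in paths)
    index_filter = f"git rm -r --cached --ignore-unmatch {quoted}"
    return [
        "git", "filter-branch", "--force",
        "--index-filter", index_filter,
        "--prune-empty",             # drop commits that become empty
        "--tag-name-filter", "cat",  # keep tag names unchanged
        "--", "--all",
    ]
```

Because this rewrites every commit, it is best run on a fresh clone of the converted repository before the first push to GitHub Enterprise, and on large histories it can take a long time.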

 

I would like to hear about your experience if you have run into any issues in your SVN migration, and I would love to help if you face any challenges.

 

References:

  1. https://www.sourcetreeapp.com/  - Free Sourcetree tool
  2. https://www.atlassian.com/git/tutorials/migrating-convert - git-svn utility
  3. https://www.hashicorp.com/blog/managing-github-with-terraform - GitHub API and Terraform onboarding