Leaving Behind the Software History When Transitioning to Open-Source: Reasons and Implications

Gustavo Pinto

June 11, 2018

Transcript

  1. Leaving Behind the Software History When Transitioning to Open-Source: Reasons and Implications

    @gustavopinto @igorsteinmacher @gerosa_marco
  2. Mihai Codoban, Sruti Srinivasa Ragavan, Danny Dig, and Brian Bailey. Software history under the lens: A study on why and how developers examine it. In ICSME 2015, pages 1–10, 2015.

    “Software history is indispensable for developers. Of the 217 developers surveyed in this work, 85% find software history important to their development activities and 61% need to refer to history at least several times a day.”

    More benefits: knowledge acquisition (Pham et al., 2013); end users take advantage of the software history (Kuttal et al., 2014); research (co-changes, defect prediction, mining, etc.).
  3. We found 50 proprietary projects that made the shift to open source and deleted the history.

    We could find only 8 projects that kept the history.
  4. We found 50 proprietary projects that made the shift to open source.

    Question 1: Why did you decide not to keep the software history?
  5. We found 50 proprietary projects that made the shift to open source.

    Question 2: Do the core developers face any kind of problems with the lack of software history?
  6. We found 50 proprietary projects that made the shift to open source.

    Question 3: Do the newcomers face any kind of problems with the lack of software history?
  7. We found 50 proprietary projects that made the shift to open source.

    Question 4: How did the lack of software history impact software evolution?
  8. We found 50 proprietary projects that made the shift to open source.

    15 did not answer our inquiries. 41 answers in total.
  14. RQ1. Why do some projects not open the software history?

    Extracting just the subfolder would have been difficult, and older versions would not have built.

    First get something working, and then disentangle it from your own proprietary code, configuration, etc.

    Reasons: entangled with proprietary code; contains sensitive information; housekeeping needed; license and legal reasons.
  15. RQ1. Why do some projects not open the software history?

    The earliest commits may contain information we cannot share, so upon releasing we squashed the history.

    Going through thousands of commits means no one will take on the heroic task of even open-sourcing the product.

    Reasons: entangled with proprietary code; contains sensitive information; housekeeping needed; license and legal reasons.
  16. RQ1. Why do some projects not open the software history?

    We cleaned embarrassing or inappropriate comments, brought the code up to OSS standards …

    Reasons: entangled with proprietary code; contains sensitive information; housekeeping needed; license and legal reasons.
  17. RQ1. Why do some projects not open the software history?

    Made it much easier to get the lawyers at our parent company to agree to open source it. Instead of reviewing the entire history, they could review just the current state.

    Reasons: entangled with proprietary code; contains sensitive information; housekeeping needed; license and legal reasons.
  18. RQ2. What are the challenges of dealing with a history-free project?

    None of the core developers has wanted or needed to go look back through the history.

    Communication, documentation, and idiomatic expressions of the Python code are sufficient to maintain project coherency.
  19. RQ2. What are the challenges of dealing with a history-free project?

    We still use the non-git system internally and can refer to history if we need to.

    I’m probably the person most likely to access it, and I’d estimate that I use it only a few times per year.
  20. RQ2. What are the challenges of dealing with a history-free project?

    For a fast-moving project, history from more than half a year ago is not particularly valuable for development.

    I am not aware of any problems for newcomers.

    The lack of software history does not greatly impact software evolution and understanding.
  21. Open challenges

    How to design tools to leverage and visualize the software history?

    How to improve tools to migrate code between repositories and to disentangle source code?

    How to find sensitive information in the software history?

    How to estimate the cost of releasing the history?

    When do developers need to understand the software history?
  22. Leaving Behind the Software History When Transitioning to Open-Source: Reasons and Implications

    @gustavopinto @igorsteinmacher @gerosa_marco