Many companies and open-source projects use version control systems (VCS) such as CVS, Subversion, ClearCase and Git to record the history of their source-code files. A VCS records the whole development history and the change activities applied to the source code. These versioning systems are often paired with change management systems, such as Bugzilla and JIRA, which let users report bugs and request new features. Together, these systems record the entire evolution of a software system.
The highly active Mining Software Repositories community (see the MSR conference) has shown that the evolutionary information implicitly stored in these systems can provide valuable insight into how software is developed. For example, the evolution of a source-code file over time may explain why earlier design choices were made in that file. The first step in analysing the evolutionary information of a software system is to extract the necessary information from the version control system or change management system.
To explore some of the work done in the MSR field, you can refer to this link. Some examples of interesting studies include:
This lab session serves as a first contact with this newly arising domain. You will learn what kind of knowledge can be extracted from a software repository. To gain this understanding, we will use an initial prototype.
Case Study: JabRef
For this exercise we will use JabRef. Open the shell and clone the project. Use git log to explore the project's evolution.
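The clone-and-explore step could look like the sketch below. The helper name `first_look` is hypothetical, and the repository URL in the usage comment is an assumption (the handout does not state it), so substitute the address given by your instructor if it differs.

```shell
# Hypothetical helper: clone a repository (unless it is already there)
# and print a compact, one-line-per-commit view of its recent history.
#   $1 = repository URL or local path
#   $2 = target directory
#   $3 = number of commits to show (defaults to 20)
first_look() {
  # Clone only if the target is not already a Git working copy.
  [ -d "$2/.git" ] || git clone --quiet "$1" "$2" || return 1
  # --oneline prints the abbreviated hash and the subject line of each commit.
  git -C "$2" log --oneline -n "${3:-20}"
}

# Example invocation (the URL is an assumption, not taken from this handout):
#   first_look https://github.com/JabRef/jabref.git jabref 20
```

Once the clone exists, plain `git log`, `git log --stat` (files touched per commit) and `git log -- <path>` (history of one file) are useful starting points for exploring the project's evolution.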
Query the Versioning System
First, we will run simple queries against the version control system to identify interesting knowledge about the software system at hand. Study the information provided by the extracted Git log (download the file here: Git.log).
Using git log and grep, we can extract several pieces of information:
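As an illustration of such queries, here is a minimal sketch of three filters that read a log in the default `git log` format from standard input. The function names are hypothetical, and the sketch assumes that Git.log was produced by a plain `git log` without custom formatting options.

```shell
# Number of commits: every entry in the default format starts with "commit <hash>".
count_commits() {
  grep -c -E '^commit [0-9a-f]{7,40}'
}

# Commits per author, most active author first.
# The e-mail address is stripped so that only the name is counted.
commits_per_author() {
  grep '^Author: ' | sed -e 's/^Author: //' -e 's/ <.*$//' | sort | uniq -c | sort -rn
}

# Number of commit-message lines mentioning a keyword (case-insensitive).
# In the default format, message lines are indented by four spaces.
#   $1 = keyword, e.g. "fix"
count_message_matches() {
  grep -i -c -E "^    .*$1"
}

# Example invocations:
#   count_commits < Git.log
#   git log | commits_per_author
#   git log | count_message_matches fix
```

Reading from standard input means the same filters work both on the downloaded Git.log file and on the live output of `git log` in your clone.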
First Contact Visualizations
The previous analysis shows that the repository can provide relevant information about the evolution of the system. However, writing queries to understand a software system can be daunting, especially during your first contact with it. Hence, we will introduce a simple visualization of the versioning system: EGIT.
Launch Eclipse. EGIT is an Eclipse plugin created to support Git repositories. Follow the manual and answer the following questions:
EGIT is only one of the tools that can present the information recorded in a software repository in a convenient view. There are many others, often developed by the MSR community (here is an updated list), that explore software evolution. Each one is characterized by the kind of repository it handles (e.g. Git) and the information it extracts. Here are a few examples:
We will use gource to visualize the evolution of a project, more specifically which developers worked on which files and when. It can be run from the command line using the "gource" command with the project root folder as its argument.
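A minimal sketch of such an invocation is shown below. The wrapper name `show_evolution` is hypothetical, and the chosen speed settings are assumptions made for a short demo, not values prescribed by this lab.

```shell
# Hypothetical wrapper: replay the history of a Git project with gource.
#   $1 = project root folder (defaults to the current directory)
#   $2 = seconds of animation per day of history (defaults to 1)
show_evolution() {
  root="${1:-.}"
  # Refuse to start if the folder is not a Git working copy.
  if [ ! -d "$root/.git" ]; then
    echo "not a Git working copy: $root" >&2
    return 1
  fi
  # --seconds-per-day sets the replay speed;
  # --auto-skip-seconds skips ahead after that many seconds without activity.
  gource --seconds-per-day "${2:-1}" --auto-skip-seconds 1 "$root"
}

# Example invocation:
#   show_evolution jabref 0.5
```

Lower values for the second argument make long histories play back faster, which helps with projects that have many years of commits.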