nwb_project_analytics.gitstats module
Module for querying GitHub repos
- class nwb_project_analytics.gitstats.GitHubRepoInfo(repo)
Bases:
object
Helper class to get information about a repo from GitHub
- Variables:
repo – a GitRepo tuple with the owner and name of the repo
- static collect_all_release_names_and_date(repos: dict, cache_dir: str, read_cache: bool = True, write_cache: bool = True)
- get_release_names_and_dates(**kwargs)
Get names and dates of releases
- Parameters:
kwargs – Additional keyword arguments to be passed to self.get_releases
- Returns:
Tuple with the list of names as strings and the list of dates as datetime objects
- get_releases(use_cache=True)
Get the last 100 releases for the given repo
- NOTE: GitHub uses pagination. Here we set the number of items per page to 100,
which should usually fit all releases, but in the future we may need to iterate over pages to get all the releases, not just the latest 100. Possible implementation: https://gist.github.com/victorbordo/5581fdfb89ed93bf3eb2b478529b9e38
- Parameters:
use_cache – If set to True, return the cached results if they were computed previously. In this case the per_page parameter will be ignored
- Raises:
Error if response is not Ok, e.g., if the GitHub request limit is exceeded.
- Returns:
List of dicts with the release data
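The pagination note above could be addressed along the following lines. This is a minimal sketch, not the module's implementation: `collect_all_pages` and `fake_fetch` are hypothetical names, and `fake_fetch` stands in for a real request to the GitHub releases endpoint using its `page` and `per_page` query parameters.

```python
from typing import Callable, Dict, List


def collect_all_pages(fetch_page: Callable[[int, int], List[Dict]],
                      per_page: int = 100,
                      max_pages: int = 50) -> List[Dict]:
    """Collect items page by page until a page comes back shorter than per_page."""
    items: List[Dict] = []
    for page in range(1, max_pages + 1):
        batch = fetch_page(page, per_page)
        items.extend(batch)
        if len(batch) < per_page:
            break  # a short (or empty) page means there is nothing left
    return items


# Hypothetical stand-in for an HTTP request; it serves 250 fake releases.
_FAKE_RELEASES = [{"tag_name": f"0.{i}.0"} for i in range(250)]


def fake_fetch(page: int, per_page: int) -> List[Dict]:
    start = (page - 1) * per_page
    return _FAKE_RELEASES[start:start + per_page]


releases = collect_all_pages(fake_fetch)
print(len(releases))  # 250
```

Passing the fetch function as an argument keeps the loop independent of the HTTP client and makes it easy to test without touching the GitHub request limit.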
- static get_version_jump_from_tags(tags)
Assuming the release tags follow semantic versioning, get the version jumps from the tags
- Returns:
OrderedDict
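One way to derive version jumps from semantic-versioning tags is sketched below. The layout of the OrderedDict actually returned by `get_version_jump_from_tags` is not documented here, so the mapping of tag to jump type, the helper names `parse_version` and `classify_version_jumps`, and the rule of skipping tags that are not plain `major.minor.patch` are all assumptions made for illustration.

```python
from collections import OrderedDict


def parse_version(tag):
    """Return (major, minor, patch) or None if the tag is not plain semver."""
    parts = tag.lstrip("v").split(".")
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        return None
    return tuple(int(p) for p in parts)


def classify_version_jumps(tags):
    """Map each tag to 'major', 'minor', or 'patch' relative to the previous tag."""
    jumps = OrderedDict()
    previous = None
    for tag in tags:  # tags are assumed to be ordered oldest to newest
        version = parse_version(tag)
        if version is None:
            continue  # assumption: skip pre-release tags such as '2.0.0b'
        if previous is not None:
            if version[0] != previous[0]:
                jumps[tag] = "major"
            elif version[1] != previous[1]:
                jumps[tag] = "minor"
            elif version[2] != previous[2]:
                jumps[tag] = "patch"
        previous = version
    return jumps


jumps = classify_version_jumps(["1.0.0", "1.0.1", "1.1.0", "2.0.0b", "2.0.0"])
print(jumps)
```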
- class nwb_project_analytics.gitstats.GitRepo(owner: str, repo: str, mainbranch: str, docs: str | None = None, logo: str | None = None, startdate: datetime | None = None)
Bases:
tuple
Named tuple with basic information about a GitHub repository
- static compute_issue_time_of_first_response(issue)
For a given GitHub issue, compute the time to first response based on the issue’s timeline
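As an illustration of the idea, the sketch below computes a time to first response from a simplified timeline. It does not use PyGitHub objects: the plain-dict event format, the keys `actor` and `created_at`, and the rule that a response is the first event by someone other than the issue author are assumptions of this sketch, not the documented behavior.

```python
from datetime import datetime, timedelta
from typing import Dict, List, Optional


def time_to_first_response(created_at: datetime,
                           author: str,
                           events: List[Dict]) -> Optional[timedelta]:
    """Return the delay until the first event by someone other than the author."""
    response_times = [
        event["created_at"]
        for event in events
        if event["actor"] != author and event["created_at"] >= created_at
    ]
    if not response_times:
        return None  # nobody else has reacted to the issue yet
    return min(response_times) - created_at


# Hypothetical timeline used only for illustration.
opened = datetime(2023, 1, 1, 9, 0)
timeline = [
    {"actor": "reporter", "created_at": datetime(2023, 1, 1, 9, 30)},
    {"actor": "maintainer", "created_at": datetime(2023, 1, 2, 9, 0)},
]
delay = time_to_first_response(opened, "reporter", timeline)
print(delay)  # 1 day, 0:00:00
```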
- get_commits_as_dataframe(since, github_obj, tqdm)
Get a dataframe for all commits with updates later than the given date
- Parameters:
since – Datetime object with the date of the oldest commit to retrieve
github_obj – PyGitHub github.Github object to use for retrieving commits
tqdm – Supply the tqdm progress bar class to use
- Returns:
Pandas DataFrame with the commits data
- get_issues_as_dataframe(since, github_obj, tqdm=None)
Get a dataframe for all issues with updates later than the given date
- Parameters:
since – Datetime object with the date of the oldest issue to retrieve
github_obj – PyGitHub github.Github object to use for retrieving issues
tqdm – Supply the tqdm progress bar class to use
- Returns:
Pandas DataFrame with the issue data
- property github_issues_url
URL for GitHub issues page
- property github_path
https path for the git repo
- property github_pulls_url
URL for GitHub pull requests page
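The pattern of a named tuple that derives URLs from its fields can be sketched as follows. `RepoSketch` is a hypothetical stand-in with only two of the GitRepo fields, and the exact strings returned by the real properties are an assumption based on GitHub's usual URL layout.

```python
from typing import NamedTuple


class RepoSketch(NamedTuple):
    """Hypothetical, reduced stand-in for GitRepo."""
    owner: str
    repo: str

    @property
    def github_path(self) -> str:
        # Assumed layout: https://github.com/<owner>/<repo>
        return f"https://github.com/{self.owner}/{self.repo}"

    @property
    def github_issues_url(self) -> str:
        return f"{self.github_path}/issues"

    @property
    def github_pulls_url(self) -> str:
        return f"{self.github_path}/pulls"


pynwb = RepoSketch(owner="NeurodataWithoutBorders", repo="pynwb")
print(pynwb.github_issues_url)
```

Because the URLs are properties computed from `owner` and `repo`, they never need to be stored or kept in sync with the tuple fields.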
- class nwb_project_analytics.gitstats.GitRepos(*arg, **kw)
Bases:
OrderedDict
Dict where the keys are names of codes and the values are GitRepo objects
- get_info_objects()
Get an OrderedDict of GitHubRepoInfo objects from the repos
- static merge(o1, o2)
Merge two GitRepos dicts and return a new GitRepos dict with the combined items
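A merge that returns a new dict and leaves both inputs untouched might look like the sketch below. It uses plain OrderedDict objects with string values, and the choice that entries from the second dict win on duplicate keys is an assumption; the documentation above does not say how `merge` resolves collisions.

```python
from collections import OrderedDict


def merge_ordered(first, second):
    """Return a new OrderedDict with the items of first followed by second."""
    merged = OrderedDict(first)  # copy, so the inputs are not modified
    merged.update(second)        # assumption: later values override earlier ones
    return merged


core = OrderedDict([("PyNWB", "pynwb"), ("HDMF", "hdmf")])
extra = OrderedDict([("MatNWB", "matnwb")])
combined = merge_ordered(core, extra)
print(list(combined))  # ['PyNWB', 'HDMF', 'MatNWB']
```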
- class nwb_project_analytics.gitstats.IssueLabel(label: str, description: str, color: str)
Bases:
tuple
Named tuple describing a label for issues on a Git repository.
- label: str
Label of the issue, usually consisting of <type>: <level>. <type> indicates the general area the label is used for, e.g., to assign a category, priority, or topic to an issue. <level> then indicates the importance or sub-category within the given <type>, e.g., critical, high, medium, low level as part of the priority type
- property level
Get the level of the issue, indicating the importance or sub-category of the label within the given self.type, e.g., critical, high, medium, low level as part of the priority type.
- Returns:
str with the level or None in case the label does not have a level (e.g., if the label does not contain a “:” to separate the type and level).
- property rgb
Color code converted to RGB
- Returns:
Tuple of ints with (red, green, blue) color values
- property type
Get the type of the issue label indicating the general area the label is used for, e.g., to assign a category, priority, or topic to an issue.
- Returns:
str with the type or None in case the label does not have a category (i.e., if the label does not contain a “:” to separate the type and level).
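The type, level, and rgb properties described above can be illustrated with a small sketch. `LabelSketch` is a hypothetical stand-in for IssueLabel; splitting at the first “:” and stripping whitespace is an assumption about how the real properties parse the label.

```python
from typing import NamedTuple, Optional, Tuple


class LabelSketch(NamedTuple):
    """Hypothetical stand-in for IssueLabel."""
    label: str
    description: str
    color: str

    @property
    def type(self) -> Optional[str]:
        # Text before the first ':' or None if there is no separator
        if ":" not in self.label:
            return None
        return self.label.split(":", 1)[0].strip()

    @property
    def level(self) -> Optional[str]:
        # Text after the first ':' or None if there is no separator
        if ":" not in self.label:
            return None
        return self.label.split(":", 1)[1].strip()

    @property
    def rgb(self) -> Tuple[int, int, int]:
        # Convert a hex code such as '#D93F0B' to (red, green, blue) ints
        digits = self.color.lstrip("#")
        return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


high = LabelSketch("priority: high", "impacts most users", "#D93F0B")
print(high.type, high.level, high.rgb)  # priority high (217, 63, 11)
```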
- class nwb_project_analytics.gitstats.IssueLabels(*arg, **kw)
Bases:
OrderedDict
OrderedDict where the keys are names of issues labels and the values are IssueLabel objects
- property colors
Get a list of all color hex codes used
- get_by_type(label_type)
Get a new IssueLabels dict with just the labels with the given type
- property levels
Get a list of all level strings used in labels (may include None)
- static merge(o1, o2)
Merge two IssueLabels dicts and return a new IssueLabels dict with the combined items
- property rgbs
Get a list of all rgb color codes used
- property types
Get a list of all type strings used in labels (may include None)
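Filtering a label dict by type, as `get_by_type` does, can be sketched with an OrderedDict subclass. `LabelsSketch` is hypothetical, the values are just the color codes of three of the standard labels rather than IssueLabel objects, and matching on the text before the first “:” is an assumption.

```python
from collections import OrderedDict


class LabelsSketch(OrderedDict):
    """Hypothetical stand-in for IssueLabels, keyed by label name."""

    def get_by_type(self, label_type):
        """Return a new LabelsSketch with only the labels of the given type."""
        selected = LabelsSketch()
        for name, value in self.items():
            prefix, separator, _ = name.partition(":")
            if separator and prefix.strip() == label_type:
                selected[name] = value
        return selected


labels = LabelsSketch([
    ("priority: high", "#D93F0B"),
    ("priority: low", "#FEF2C0"),
    ("topic: docs", "#D4C5F9"),
])
priorities = labels.get_by_type("priority")
print(list(priorities))  # ['priority: high', 'priority: low']
```

Returning the same subclass keeps helper methods available on the filtered result, so calls can be chained.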
- class nwb_project_analytics.gitstats.NWBGitInfo
Bases:
object
Class for storing basic information about NWB repositories
- class property CORE_API_REPOS
Dictionary with the main NWB git repos related to the user APIs.
- CORE_DEVELOPERS = ['rly', 'bendichter', 'oruebel', 'ajtritt', 'ln-vidrio', 'mavaylon1', 'CodyCBakerPhD', 'stephprince', 'lawrence-mbf', 'dependabot[bot]', 'nwb-bot', 'hdmf-bot', 'pre-commit-ci[bot]']
List of names of the core developers of NWB overall. These are used, e.g., when analyzing issue stats as core developer issues should not count against user issues.
- GIT_REPOS = {'HDMF': GitRepo(owner='hdmf-dev', repo='hdmf', mainbranch='dev', docs='https://hdmf.readthedocs.io', logo='https://raw.githubusercontent.com/hdmf-dev/hdmf/dev/docs/source/hdmf_logo.png', startdate=datetime.datetime(2019, 3, 13, 0, 0)), 'HDMF_Common_Schema': GitRepo(owner='hdmf-dev', repo='hdmf-common-schema', mainbranch='main', docs='https://hdmf-common-schema.readthedocs.io', logo=None, startdate=None), 'HDMF_DocUtils': GitRepo(owner='hdmf-dev', repo='hdmf-docutils', mainbranch='main', docs=None, logo=None, startdate=None), 'HDMF_Schema_Language': GitRepo(owner='hdmf-dev', repo='hdmf-schema-language', mainbranch='main', docs='https://hdmf-schema-language.readthedocs.io/', logo=None, startdate=None), 'HDMF_Zarr': GitRepo(owner='hdmf-dev', repo='hdmf-zarr', mainbranch='dev', docs='https://hdmf-zarr.readthedocs.io', logo='https://raw.githubusercontent.com/hdmf-dev/hdmf-zarr/dev/docs/source/figures/logo_hdmf_zarr.png', startdate=None), 'Hackathons': GitRepo(owner='NeurodataWithoutBorders', repo='nwb_hackathons', mainbranch='main', docs='https://neurodatawithoutborders.github.io/nwb_hackathons/', logo=None, startdate=None), 'MatNWB': GitRepo(owner='NeurodataWithoutBorders', repo='matnwb', mainbranch='master', docs='https://neurodatawithoutborders.github.io/matnwb/', logo='https://raw.githubusercontent.com/NeurodataWithoutBorders/matnwb/master/logo/logo_matnwb.png', startdate=None), 'NDX_Catalog': GitRepo(owner='nwb-extensions', repo='nwb-extensions.github.io', mainbranch='main', docs='https://nwb-extensions.github.io/', logo='https://github.com/nwb-extensions/nwb-extensions.github.io/blob/main/images/ndx-logo-text.png', startdate=None), 'NDX_Extension_Smithy': GitRepo(owner='nwb-extensions', repo='nwb-extensions-smithy', mainbranch='master', docs=None, logo=None, startdate=datetime.datetime(2019, 4, 25, 0, 0)), 'NDX_Staged_Extensions': GitRepo(owner='nwb-extensions', repo='staged-extensions', mainbranch='master', docs=None, logo=None, startdate=None), 
'NDX_Template': GitRepo(owner='nwb-extensions', repo='ndx-template', mainbranch='main', docs='https://nwb-overview.readthedocs.io/en/latest/extensions_tutorial/2_create_extension_spec_walkthrough.html', logo=None, startdate=None), 'NWBInspector': GitRepo(owner='NeurodataWithoutBorders', repo='nwbinspector', mainbranch='dev', docs='https://nwbinspector.readthedocs.io', logo='https://raw.githubusercontent.com/NeurodataWithoutBorders/nwbinspector/dev/docs/logo/logo.png', startdate=None), 'NWBWidgets': GitRepo(owner='NeurodataWithoutBorders', repo='nwb-jupyter-widgets', mainbranch='master', docs=None, logo='https://user-images.githubusercontent.com/844306/254117081-f20b8c26-79c7-4c1c-a3b5-b49ecf8cce5d.png', startdate=None), 'NWB_Benchmarks': GitRepo(owner='NeurodataWithoutBorders', repo='nwb_benchmarks', mainbranch='main', docs=None, logo=None, startdate=None), 'NWB_GUIDE': GitRepo(owner='NeurodataWithoutBorders', repo='nwb-guide', mainbranch='main', docs='https://github.com/NeurodataWithoutBorders/nwb-guide', logo='https://raw.githubusercontent.com/NeurodataWithoutBorders/nwb-guide/main/src/renderer/assets/img/logo-guide-draft-transparent-tight.png', startdate=datetime.datetime(2022, 11, 21, 0, 0)), 'NWB_Overview': GitRepo(owner='NeurodataWithoutBorders', repo='nwb-overview', mainbranch='main', docs='https://nwb-overview.readthedocs.io', logo=None, startdate=None), 'NWB_Project_Analytics': GitRepo(owner='NeurodataWithoutBorders', repo='nwb-project-analytics', mainbranch='main', docs='https://github.com/NeurodataWithoutBorders/nwb-project-analytics', logo=None, startdate=None), 'NWB_Schema': GitRepo(owner='NeurodataWithoutBorders', repo='nwb-schema', mainbranch='dev', docs='https://nwb-schema.readthedocs.io', logo=None, startdate=None), 'NWB_Schema_Language': GitRepo(owner='NeurodataWithoutBorders', repo='nwb-schema-language', mainbranch='main', docs='https://schema-language.readthedocs.io', logo=None, startdate=None), 'NeuroConv': GitRepo(owner='catalystneuro', 
repo='neuroconv', mainbranch='main', docs='https://neuroconv.readthedocs.io', logo='https://github.com/catalystneuro/neuroconv/blob/main/docs/img/neuroconv_logo.png', startdate=None), 'PyNWB': GitRepo(owner='NeurodataWithoutBorders', repo='pynwb', mainbranch='dev', docs='https://pynwb.readthedocs.io', logo='https://raw.githubusercontent.com/NeurodataWithoutBorders/pynwb/dev/docs/source/figures/logo_pynwb.png', startdate=None)}
Dictionary with main NWB git repositories. The values are GitRepo tuples with the owner and repo name.
- HDMF_START_DATE = datetime.datetime(2019, 3, 13, 0, 0)
HDMF was originally part of PyNWB. As such, code statistics for HDMF before this start date include both PyNWB and HDMF and will result in duplicate counting of code stats if PyNWB and HDMF are shown together. For HDMF, 2019-03-13 coincides with the removal of HDMF from PyNWB with PR #850 and the release of HDMF 1.0. For plotting, 2019-03-13 is therefore a good date to start considering HDMF stats to avoid duplication of code in statistics, even though the HDMF repo has existed on GitHub since 2019-01-23T23:48:27Z, which could alternatively be considered the start date. Older dates will include code history carried over from PyNWB to HDMF. Set to None to consider the full history of HDMF, but as mentioned, this will lead to some duplicate counting of code before 2019-03-13.
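The effect of a start date on code statistics can be sketched as a simple filter. The function name `clip_to_start_date` and the sample values are placeholders invented for illustration; only the date 2019-03-13 and the meaning of None come from the description above.

```python
from datetime import datetime

HDMF_START_DATE = datetime(2019, 3, 13)


def clip_to_start_date(samples, start_date):
    """Keep (date, value) samples on or after start_date; None keeps everything."""
    if start_date is None:
        return list(samples)
    return [(when, value) for when, value in samples if when >= start_date]


# Placeholder samples of (date, lines of code), not real statistics.
samples = [
    (datetime(2019, 2, 1), 100),
    (datetime(2019, 3, 13), 120),
    (datetime(2019, 6, 1), 150),
]
clipped = clip_to_start_date(samples, HDMF_START_DATE)
print(len(clipped))  # 2
```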
- MISSING_RELEASE_TAGS = {'MatNWB': [('0.1.0b', datetime.datetime(2017, 11, 11, 0, 0))], 'NWB_Schema': [('2.0.0', datetime.datetime(2019, 1, 19, 0, 0)), ('2.0.0b', datetime.datetime(2017, 11, 11, 0, 0))]}
List of early releases that are missing a tag on GitHub
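How such untagged releases might be folded into the names and dates retrieved from GitHub is sketched below. This is an assumption about usage, not the module's code: `add_missing_releases` is a hypothetical helper, and the `0.2.0` entry is a made-up placeholder; only the MatNWB `0.1.0b` entry is taken from the value above.

```python
from datetime import datetime

MISSING = {"MatNWB": [("0.1.0b", datetime(2017, 11, 11))]}


def add_missing_releases(names, dates, missing):
    """Merge (name, date) pairs for untagged releases and sort by date."""
    combined = list(zip(names, dates))
    combined += [(name, date) for name, date in missing if name not in names]
    combined.sort(key=lambda item: item[1])
    return [name for name, _ in combined], [date for _, date in combined]


# Placeholder release list standing in for data retrieved from GitHub.
names, dates = add_missing_releases(
    ["0.2.0"], [datetime(2018, 1, 1)], MISSING["MatNWB"]
)
print(names)  # ['0.1.0b', '0.2.0']
```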
- NWB1_DEPRECATION_DATE = datetime.datetime(2016, 8, 1, 0, 0)
Date when to declare the NWB 1.0 APIs as deprecated. The 3rd Hackathon was held on July 31 to August 1, 2017 at Janelia Farm, in Ashburn, Virginia, which marks the date when NWB 2.0 was officially accepted as the follow-up to NWB 1.0. NWB 1.0 as a project ended about 1 year before that.
- NWB1_GIT_REPOS = {'NWB_1.x_Matlab': GitRepo(owner='NeurodataWithoutBorders', repo='api-matlab', mainbranch='dev', docs=None, logo=None, startdate=None), 'NWB_1.x_Python': GitRepo(owner='NeurodataWithoutBorders', repo='api-python', mainbranch='dev', docs=None, logo=None, startdate=None)}
Dictionary with main NWB 1.x git repositories. The values are GitRepo tuples with the owner and repo name.
- NWB2_BETA_RELEASE = datetime.datetime(2017, 11, 11, 0, 0)
Date of the first official beta release of NWB 2 as part of SfN 2017
- NWB2_FIRST_STABLE_RELEASE = datetime.datetime(2019, 1, 19, 0, 0)
Date of the first official stable release of NWB 2.0
- NWB2_START_DATE = datetime.datetime(2016, 8, 31, 0, 0)
Date of the first release of PyNWB on the NWB GitHub. While some initial work was ongoing before that date, this was the first public release of code related to NWB 2.x
- NWB_EXTENSION_SMITHY_START_DATE = datetime.datetime(2019, 4, 25, 0, 0)
NWB_Extension_Smithy is a fork with changes. We should therefore count only the sizes after the fork date, which, based on https://api.github.com/repos/nwb-extensions/nwb-extensions-smithy, is 2019-04-25T20:56:02Z
- NWB_GUIDE_START_DATE = datetime.datetime(2022, 11, 21, 0, 0)
NWB GUIDE was forked from SODA so we want to start tracking stats starting from that date
- STANDARD_ISSUE_LABELS = {'category: bug': IssueLabel(label='category: bug', description='errors in the code or code behavior', color='#ee0701'), 'category: enhancement': IssueLabel(label='category: enhancement', description='improvements of code or code behavior', color='#1D76DB'), 'category: proposal': IssueLabel(label='category: proposal', description='discussion of proposed enhancements or new features', color='#dddddd'), 'compatibility: breaking change': IssueLabel(label='compatibility: breaking change', description='fixes or enhancements that will break schema or API compatibility', color='#B24AD1'), 'help wanted: deep dive': IssueLabel(label='help wanted: deep dive', description='request for community contributions that will involve many parts of the code base', color='#0E8A16'), 'help wanted: good first issue': IssueLabel(label='help wanted: good first issue', description='request for community contributions that are good for new contributors', color='#0E8A16'), 'priority: critical': IssueLabel(label='priority: critical', description='impacts proper operation or use of core function of NWB or the software', color='#a0140c'), 'priority: high': IssueLabel(label='priority: high', description='impacts proper operation or use of feature important to most users', color='#D93F0B'), 'priority: low': IssueLabel(label='priority: low', description='alternative solution already working and/or relevant to only specific user(s)', color='#FEF2C0'), 'priority: medium': IssueLabel(label='priority: medium', description='non-critical problem and/or affecting only a small set of NWB users', color='#FBCA04'), 'priority: wontfix': IssueLabel(label='priority: wontfix', description='will not be fixed due to low priority and/or conflict with other feature/priority', color='#ffffff'), 'topic: docs': IssueLabel(label='topic: docs', description='Issues related to documentation', color='#D4C5F9'), 'topic: testing': IssueLabel(label='topic: testing', description='Issues related to 
testing', color='#D4C5F9')}