# ghorg
[Go Report Card](https://goreportcard.com/report/github.com/gabrie30/ghorg) | [Awesome Go](https://github.com/avelino/awesome-go) | [License: Apache 2.0](https://opensource.org/licenses/Apache-2.0) | [WakeMeOps](https://docs.wakemeops.com//packages/ghorg)
Pronounced [gore-guh]; similar to [gorge](https://www.dictionary.com/browse/gorge). You can use ghorg to gorge on orgs.
Use ghorg to quickly clone all of an org's or user's repos into a single directory. This can be useful in many situations, including:
1. Searching an org's/user's codebase with ack, the silver searcher, grep, etc.
1. Bash scripting
1. Creating backups
1. Onboarding new team members (cloning all team repos)
1. Performing Audits
> With the default configuration, ghorg performs two actions:
>
> 1. It clones a repo if it's not inside the clone directory.
> 2. If the repo already exists locally in the clone directory, it performs a git pull and git clean on the repo.
>
> So when running ghorg a second time on the same org/user, all local changes in the cloned directory will by default be overwritten by what's on GitHub. If you want to work out of this directory, make sure you either rename the directory or set the `--no-clean` flag on all future clones to prevent losing your local changes.
## Supported Providers
- GitHub (Self Hosted & Cloud)
  - [Install](https://github.com/gabrie30/ghorg#installation) | [Setup](https://github.com/gabrie30/ghorg#github-setup) | [Examples](https://github.com/gabrie30/ghorg/blob/master/examples/github.md)
- GitLab (Self Hosted & Cloud)
  - [Install](https://github.com/gabrie30/ghorg#installation) | [Setup](https://github.com/gabrie30/ghorg#gitlab-setup) | [Examples](https://github.com/gabrie30/ghorg/blob/master/examples/gitlab.md)
- Bitbucket (Cloud & Self-hosted Server)
  - [Install](https://github.com/gabrie30/ghorg#installation) | [Setup](https://github.com/gabrie30/ghorg#bitbucket-setup) | [Examples](https://github.com/gabrie30/ghorg/blob/master/examples/bitbucket.md)
- Gitea (Self Hosted Only)
  - [Install](https://github.com/gabrie30/ghorg#installation) | [Setup](https://github.com/gabrie30/ghorg#gitea-setup) | [Examples](https://github.com/gabrie30/ghorg/blob/master/examples/gitea.md)
- Sourcehut (Limited Features)
  - [Install](https://github.com/gabrie30/ghorg#installation) | [Setup](https://github.com/gabrie30/ghorg#sourcehut-setup) | [Examples](https://github.com/gabrie30/ghorg/blob/master/examples/sourcehut.md)
> The terminology used in ghorg is that of GitHub, mainly orgs/repos. GitLab and Bitbucket use different terminology. GitLab provides a handy chart that translates the terminology [here](https://about.gitlab.com/images/blogimages/gitlab-terminology.png). Note that some features may differ between providers.
## High Level Features
- [Filter](#selective-repository-cloning) or select specific repositories for cloning
- Create [backups](#creating-backups) of repositories
- Simplify complex clone commands using [reclone](#reclone-command) shortcuts
- Initiate clone operations via [HTTP server](https://github.com/gabrie30/ghorg/blob/master/examples/reclone-server.md)
- Schedule cloning tasks using [cron](https://github.com/gabrie30/ghorg/blob/master/examples/reclone-cron.md)
- Monitor and track clone [metrics](#tracking-clone-data-over-time) over time
## Installation
There are several installation methods available; choose the one that suits you:
- [Prebuilt Binaries](#prebuilt-binaries)
- [Homebrew](#homebrew)
- [Mise](#mise)
- [Golang](#golang)
- [Docker](#using-docker)
- [Windows Support](#windows-support)
For each installation method, optionally create a ghorg configuration file. See the [configuration](#configuration) section for more details.
```bash
mkdir -p $HOME/.config/ghorg
curl https://raw.githubusercontent.com/gabrie30/ghorg/master/sample-conf.yaml > $HOME/.config/ghorg/conf.yaml
vi $HOME/.config/ghorg/conf.yaml # To update your configuration
```
### Prebuilt Binaries
See [latest release](https://github.com/gabrie30/ghorg/releases/latest) to download directly for
- Mac (Darwin)
- Windows
- Linux
If you don't know which to choose, it's most likely the x86_64 version for your operating system.
### Homebrew
```bash
brew install ghorg
```
### Mise
If you use [Mise](https://github.com/jdx/mise), the polyglot tool version manager, you can install the latest version of `ghorg` on Linux/macOS/Windows with:
```bash
mise use -g ghorg@latest
```
### Golang
```bash
# ensure $HOME/go/bin is in your path ($ echo $PATH | grep $HOME/go/bin)
# if using go 1.16+ locally
go install github.com/gabrie30/ghorg@latest
# older go versions can run
go get github.com/gabrie30/ghorg
```
## Configuration
Precedence for configuration is first given to the flags set on the command-line, then to what's set in your `$HOME/.config/ghorg/conf.yaml`. This file comes from the [sample-conf.yaml](https://github.com/gabrie30/ghorg/blob/master/sample-conf.yaml) and can be installed by performing the following.
```bash
mkdir -p $HOME/.config/ghorg
curl https://raw.githubusercontent.com/gabrie30/ghorg/master/sample-conf.yaml > $HOME/.config/ghorg/conf.yaml
vi $HOME/.config/ghorg/conf.yaml # To update your configuration
```
If no configuration file is found, ghorg will use its defaults and try to clone a GitHub org. An API token is always required.
You can have multiple configuration files which is useful if you clone from multiple SCM providers with different tokens and settings. Alternative configuration files can only be referenced as a command-line flag `--config`.
If you have multiple different orgs/users/configurations to clone see the `ghorg reclone` command as a way to manage them.
Note: ghorg will respect the `XDG_CONFIG_HOME` [environment variable](https://wiki.archlinux.org/title/XDG_Base_Directory) if set.
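When juggling several configuration files, a small helper can keep the paths consistent. The following is a minimal sketch, not part of ghorg: the function names and the `conf-<provider>.yaml` naming are hypothetical, and it assumes the config directory is `$XDG_CONFIG_HOME/ghorg` when that variable is set and `$HOME/.config/ghorg` otherwise.

```bash
# Directory holding ghorg configuration files. Assumption for this sketch:
# $XDG_CONFIG_HOME/ghorg when XDG_CONFIG_HOME is set, else $HOME/.config/ghorg.
ghorg_config_dir() {
  printf '%s/ghorg\n' "${XDG_CONFIG_HOME:-$HOME/.config}"
}

# Path of a per-provider config file. The conf-<provider>.yaml naming is a
# hypothetical convention, not something ghorg requires.
ghorg_conf_for() {
  printf '%s/conf-%s.yaml\n' "$(ghorg_config_dir)" "$1"
}

# Print where each provider's config file would live.
for provider in github gitlab; do
  ghorg_conf_for "$provider"
done
```

A clone could then select a file explicitly, for example `ghorg clone your-group --scm=gitlab --config="$(ghorg_conf_for gitlab)"`, where `your-group` is a placeholder.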
## SCM Provider Setup
> Note: if you are running into issues, read the troubleshooting and known issues section below
### GitHub Setup
1. Create a [Personal Access Token](https://help.github.com/en/github/authenticating-to-github/creating-a-personal-access-token-for-the-command-line) with all `repo` scopes. Update `GHORG_GITHUB_TOKEN` in your `ghorg/conf.yaml`, pass it as a cli flag, or place it in a file and add the path to `GHORG_GITHUB_TOKEN`. If your org has SAML SSO in front, you will need to give your token those permissions as well; see [this doc](https://docs.github.com/en/github/authenticating-to-github/authenticating-with-saml-single-sign-on/authorizing-a-personal-access-token-for-use-with-saml-single-sign-on).
1. For cloning GitHub Enterprise (self hosted github instances) repos you must set `--base-url` e.g. `ghorg clone --base-url=https://internal.github.com`
1. See [examples/github.md](https://github.com/gabrie30/ghorg/blob/master/examples/github.md) on how to run
#### GitHub App Authentication (Advanced)
1. [Create a GitHub App](https://docs.github.com/en/apps/creating-github-apps/setting-up-a-github-app/creating-a-github-app) in your Organization. You only need to fill out the required fields. Make sure to give Repository Permissions -> contents -> read only permissions
1. Install the GitHub App into your Organization
1. Generate a private key from the GitHub App, set the location of the key to `GHORG_GITHUB_APP_PEM_PATH`
1. Locate the GitHub App ID from the GitHub App, set the value to `GHORG_GITHUB_APP_ID`
1. Locate the GitHub Installation ID from the URL of the GitHub app, set the value to `GHORG_GITHUB_APP_INSTALLATION_ID`. NOTE: you will need to use the actual GitHub url to get this ID, go to your GitHub Organization Settings Page -> Third Party Access -> GitHub Apps -> Configure -> Get ID from URL
### GitLab Setup
1. Create [Personal Access Token](https://docs.gitlab.com/ee/user/profile/personal_access_tokens.html) with the `read_api` scope (or `api` for self-managed GitLab older than 12.10). This token can be added to your `ghorg/conf.yaml` or as a cli flag.
1. Update the `GitLab Specific` config in your `ghorg/conf.yaml` or via cli flags or place it in a file and add the path to `GHORG_GITLAB_TOKEN`
1. Update `GHORG_SCM_TYPE` to `gitlab` in your `ghorg/conf.yaml` or via cli flags
1. See [examples/gitlab.md](https://github.com/gabrie30/ghorg/blob/master/examples/gitlab.md) on how to run
### Gitea Setup
1. Create [Access Token](https://docs.gitea.io/en-us/api-usage/) (Settings -> Applications -> Generate Token)
1. Update `GHORG_GITEA_TOKEN` in your `ghorg/conf.yaml` or use the (--token, -t) flag or place it in a file and add the path to `GHORG_GITEA_TOKEN`.
1. Update `GHORG_SCM_TYPE` to `gitea` in your `ghorg/conf.yaml` or via cli flags
1. See [examples/gitea.md](https://github.com/gabrie30/ghorg/blob/master/examples/gitea.md) on how to run
### Sourcehut Setup
1. Create a [Personal Access Token](https://meta.sr.ht/oauth2). Click "Limit scope of access grant", check "Generate read-only access token", then ctrl-click the REPOSITORIES and OBJECTS permissions.
1. Ensure you have added an SSH key if you want to clone private repos (sourcehut does not accept PATs in https URLs)
1. Update `GHORG_SOURCEHUT_TOKEN` in your `ghorg/conf.yaml` or use the (--token, -t) flag or place it in a file and add the path to `GHORG_SOURCEHUT_TOKEN`.
1. Update `GHORG_SCM_TYPE` to `sourcehut` in your `ghorg/conf.yaml` or via cli flags
> **Note on usernames**: You can specify sourcehut usernames with or without the `~` prefix (e.g., both `ghorg clone username` and `ghorg clone ~username` work). Local folder paths will never include the `~` prefix to avoid shell expansion issues.
> **For detailed examples, API limitations, and sourcehut-specific features, see [examples/sourcehut.md](https://github.com/gabrie30/ghorg/blob/master/examples/sourcehut.md)**
### Bitbucket Setup
> Note: ghorg supports both Bitbucket Cloud and Bitbucket Server (self-hosted instances)
#### API Tokens (Recommended for Bitbucket Cloud)
Bitbucket has deprecated App Passwords in favor of API Tokens. This is the recommended authentication method for Bitbucket Cloud.
1. Create an [API token](https://support.atlassian.com/bitbucket-cloud/docs/create-an-api-token/) from your Atlassian account settings
1. **Important**: When creating the token, grant **all read scopes** (Account: Read, Workspace membership: Read, Projects: Read, Repositories: Read, etc.) to ensure ghorg can list and clone repositories
1. Set `GHORG_BITBUCKET_API_TOKEN` in your `$HOME/.config/ghorg/conf.yaml` or use the `--token` flag
1. Set `GHORG_BITBUCKET_API_EMAIL` to your Atlassian account email (or use `--bitbucket-api-email`)
1. Update SCM type to `bitbucket` in your `ghorg/conf.yaml` or via cli flags
1. See [examples/bitbucket.md](https://github.com/gabrie30/ghorg/blob/master/examples/bitbucket.md) on how to run
> Note: When using API tokens, ghorg automatically uses `x-bitbucket-api-token-auth` as the Git username for clone operations, as required by Bitbucket's API token authentication.
#### App Passwords (Legacy)
> Note: Bitbucket has deprecated App Passwords. Consider using API Tokens instead.
#### PAT/OAuth token
1. Create a [PAT](https://confluence.atlassian.com/bitbucketserver/personal-access-tokens-939515499.html)
1. Set the token with `GHORG_BITBUCKET_OAUTH_TOKEN` in your `$HOME/.config/ghorg/conf.yaml` or using the `--token` flag. Make sure you do not have `--bitbucket-username` set.
1. Update SCM TYPE to `bitbucket` in your `ghorg/conf.yaml` or via cli flags
1. See [examples/bitbucket.md](https://github.com/gabrie30/ghorg/blob/master/examples/bitbucket.md) on how to run
#### Bitbucket Server (Self-hosted)
1. To configure with Bitbucket Server you will need to provide your instance URL via `GHORG_SCM_BASE_URL` in your `$HOME/.config/ghorg/conf.yaml` or use the `--base-url` flag.
1. Create credentials (username/password or app password) and update your configuration or use the `--bitbucket-username` and `--token` flags.
1. For insecure connections (HTTP), set `GHORG_INSECURE_BITBUCKET_CLIENT=true`
1. Update [SCM type](https://github.com/gabrie30/ghorg/blob/master/sample-conf.yaml#L54-L57) to `bitbucket` in your `ghorg/conf.yaml` or via cli flags
1. See [examples/bitbucket.md](https://github.com/gabrie30/ghorg/blob/master/examples/bitbucket.md) on how to run
## How to Use
See [examples](https://github.com/gabrie30/ghorg/tree/master/examples) directory for more SCM specific docs or use the examples command e.g. `ghorg examples gitlab`
```bash
$ ghorg clone kubernetes --token=bGVhdmUgYSBjb21tZW50IG9uIGlzc3VlIDY2
# Example how to use --token with a file path
$ ghorg clone kubernetes --token=~/.config/ghorg/gitlab-token.txt
$ ghorg clone davecheney --clone-type=user --token=bGVhdmUgYSBjb21tZW50IG9uIGlzc3VlIDY2
$ ghorg clone gitlab-examples --scm=gitlab --preserve-dir --token=bGVhdmUgYSBjb21tZW50IG9uIGlzc3VlIDY2
$ ghorg clone gitlab-examples/wayne-enterprises --scm=gitlab --token=bGVhdmUgYSBjb21tZW50IG9uIGlzc3VlIDY2
$ ghorg clone all-groups --scm=gitlab --base-url=https://gitlab.internal.yourcompany.com --preserve-dir
$ ghorg clone --help
# view cloned resources
$ ghorg ls
$ ghorg ls someorg
$ ghorg ls someorg | xargs -I %s mv %s bar/
```
## Changing Clone Directories
1. By default ghorg will clone the org or user repos into a directory like `$HOME/ghorg/org`. If you want to clone the org to a different directory use the `--path` flag or set `GHORG_ABSOLUTE_PATH_TO_CLONE_TO` in your ghorg conf. **This value must be an absolute path**. For example if you wanted to clone the kubernetes org to `/tmp/ghorg` you would run the following command.
```
$ ghorg clone kubernetes --path=/tmp/ghorg
```
which would create...
```
/tmp/ghorg
└── kubernetes
    ├── apimachinery
    ├── gengo
    ├── git-sync
    ├── kubeadm
    ├── kubernetes-template-project
    ├── ...
```
1. If you want to change the name of the directory the repos get cloned into, set `GHORG_OUTPUT_DIR` in your ghorg conf or set the `--output-dir` flag. For example, to clone only the repos starting with `sig-` from the kubernetes org into a directory called `kubernetes-sig-only`, you would run the following command.
```
$ ghorg clone kubernetes --match-regex=^sig- --output-dir=kubernetes-sig-only
```
which would create...
```
$HOME/ghorg
└── kubernetes-sig-only
    ├── sig-release
    ├── sig-security
    └── sig-testing
```
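Because `--path` and `GHORG_ABSOLUTE_PATH_TO_CLONE_TO` must be absolute, scripts that start from a relative directory need to resolve it first. A minimal sketch, assuming a hypothetical helper named `abs_path` (not part of ghorg) that does not resolve symlinks or `..` segments:

```bash
# Turn a possibly relative path into an absolute one, based on the current
# working directory. Paths that already start with "/" are returned unchanged.
abs_path() {
  case "$1" in
    /*) printf '%s\n' "$1" ;;
    *)  printf '%s/%s\n' "$PWD" "$1" ;;
  esac
}

abs_path /tmp/ghorg      # already absolute, printed as-is
abs_path backups/ghorg   # printed relative to the current directory
```

The result can be passed straight to the flag, e.g. `ghorg clone kubernetes --path="$(abs_path backups/ghorg)"`.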
## Selective Repository Cloning
Ghorg provides several optional ways to narrow down which repositories get cloned. Filters are applied in this order: **flag-based filters → `--target-repos-path` → `ghorgonly` → `ghorgignore`**. They can be combined for fine-grained control.
### Flag-based filters
- **Match by regex**: use `--match-regex` to include, or `--exclude-match-regex` to exclude, repos whose names match a regex.
- **Match by prefix**: use `--match-prefix` to include, or `--exclude-match-prefix` to exclude, repos whose names start with one or more prefixes.
- **Skip archived repos**: use `--skip-archived` (not supported on Bitbucket).
- **Skip forked repos**: use `--skip-forks`.
- **Filter by topic**: use `--topics` (or `GHORG_TOPICS`) to clone only repos tagged with a matching [topic](https://docs.github.com/en/repositories/managing-your-repositorys-settings-and-features/customizing-your-repository/classifying-your-repository-with-topics). GitHub, GitLab, and Gitea only.
#### `--target-repos-path` - explicit list of repo names
Maintain a file containing the exact **repository names** you want to clone (one per line) and point at it with `--target-repos-path` (or the `GHORG_TARGET_REPOS_PATH` env var). Only repos whose name appears in the file will be cloned. This is the right choice when you have a fixed, known set of repos e.g. a curated list driving a CI pipeline or a documented backup set.
- Matching is done against the **repo name only** (the basename of the clone URL with `.git` stripped) and is **case-insensitive and exact** — partial names will not match.
- There is no default location; you must always pass an explicit path via the flag or env var.
- Names listed in the file that don't exist on the org/user are reported in the clone summary so you can catch typos or repos that were renamed/removed.
- When `--clone-wiki` is enabled, each listed name will also match its corresponding `.wiki` repo. GitLab snippets are matched against their parent repo name.
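The matching rules above can be pictured with a few lines of shell. This is only an illustration of the documented semantics (repo name only, exact, case-insensitive), not ghorg's implementation; the function name and the `example-org` URLs are hypothetical.

```bash
# Succeeds when the repo name derived from a clone URL is listed in the file.
# $1 = clone URL, $2 = path to a target-repos style file (one name per line)
repo_in_targets() {
  repo_name="$(basename "$1" .git)"        # basename of the URL, .git stripped
  grep -q -i -x -F -- "$repo_name" "$2"    # case-insensitive, whole-line, literal
}

targets="$(mktemp)"
printf '%s\n' 'ghorg' 'csvToJson' > "$targets"

for url in \
  'https://github.com/example-org/Ghorg.git' \
  'https://github.com/example-org/ghorg-extras.git'
do
  if repo_in_targets "$url" "$targets"; then
    echo "clone: $url"   # Ghorg matches "ghorg" regardless of case
  else
    echo "skip:  $url"   # a partial name such as ghorg-extras does not match
  fi
done
```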
#### `ghorgonly` - auto-detected substring allowlist file
`ghorgonly` is a plain-text file you create yourself. If a file exists at `$HOME/.config/ghorg/ghorgonly`, ghorg will pick it up automatically on every clone, no flag required, and only clone repos whose clone URL **contains** one of the listed substrings (one pattern per line). This is useful when you want a dynamic, pattern-based subset of a large organization e.g. "everything published under the `infra-` namespace" without having to enumerate every repo name by hand.
- Matching is a plain substring check against the **full clone URL**, so patterns like `infra-`, `internal/platform/`, or a full URL all work.
- If no file exists at the default path, this filter is skipped entirely.
- To use a different location, or to maintain **several `ghorgonly` files for different clone scenarios** (e.g. one per team or per environment), pass `--ghorgonly-path` (or set `GHORG_ONLY_PATH`) to select the right file for that run.
- `ghorgonly` is applied **after** `--target-repos-path` and **before** `ghorgignore`, so the three can be layered (e.g., target a known list, narrow further by substring, then drop a few specific repos).
**`ghorgonly` vs. `--target-repos-path`:** use `--target-repos-path` when you know the **exact set of repo names** you want and want missing-repo warnings; use `ghorgonly` when you want a **pattern-based subset** of the org's URLs and don't need an explicit list.
#### `ghorgignore` - auto-detected denylist file
`ghorgignore` is a plain-text file you create yourself. If a file exists at `$HOME/.config/ghorg/ghorgignore`, ghorg will pick it up automatically on every clone, no flag required, and skip any repo whose clone URL **contains** a substring listed in the file (one pattern per line). This is the right tool for permanently excluding a small set of repos (legacy mirrors, vendor forks, repos you simply don't want on disk).
- Make each entry as specific as possible to avoid unintentional matches. For example, prefer a full clone URL like `https://github.com/gabrie30/ghorg.git` or `git@github.com:gabrie30/ghorg.git` over a short name fragment.
- If no file exists at the default path, this filter is skipped entirely.
- To use a different location, or to maintain **several `ghorgignore` files for different clone scenarios** (e.g. a stricter list for backups vs. a lighter list for day-to-day clones), pass `--ghorgignore-path` (or set `GHORG_IGNORE_PATH`) to select the right file for that run.
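To see how `ghorgonly` and `ghorgignore` layer, the substring checks can be imitated with `grep`. This is a sketch of the documented behavior under stated assumptions, not ghorg's own code: the function name, the temporary file locations, and the `example-org` URLs are all made up for illustration.

```bash
# Keep URLs containing any ghorgonly pattern, then drop URLs containing any
# ghorgignore pattern. Patterns are literal substrings, one per line.
# $1 = ghorgonly-style file, $2 = ghorgignore-style file; URLs come from stdin
filter_clone_urls() {
  grep -F -f "$1" | grep -v -F -f "$2"
}

work="$(mktemp -d)"
printf '%s\n' 'infra-' > "$work/ghorgonly"
printf '%s\n' 'https://github.com/example-org/infra-legacy.git' > "$work/ghorgignore"

printf '%s\n' \
  'https://github.com/example-org/infra-dns.git' \
  'https://github.com/example-org/infra-legacy.git' \
  'https://github.com/example-org/website.git' |
  filter_clone_urls "$work/ghorgonly" "$work/ghorgignore"
# prints https://github.com/example-org/infra-dns.git
```

Using the full clone URL in the ignore file, as recommended above, avoids accidentally dropping `infra-legacy-tools` or similarly named repos.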
## Creating Backups
When taking backups, the notable flags are `--backup`, `--clone-wiki`, and `--include-submodules`. The `--backup` flag will clone the repo with [git clone --mirror](https://www.git-scm.com/docs/git-clone#Documentation/git-clone.txt---mirror). The `--clone-wiki` flag will include any wiki pages the repo has. If you want to include any submodules, you will need `--include-submodules`. Lastly, if you want to exclude any binary files, use the flag `--git-filter=blob:none` to prevent them from being cloned.
```
ghorg clone kubernetes --backup --clone-wiki --include-submodules
```
This will create a `kubernetes_backup` directory for the org. Each folder inside will contain the `.git` contents of the source repo. To restore the code from the `.git` contents, move all contents into a `.git` dir, run `git init` inside the dir, then check out a branch, e.g.
```sh
# inside kubernetes_backup dir, to restore kubelet source code
cd kubelet
mkdir .git
mv -f * .git # moves all contents into .git directory
git init
git checkout master
```
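An alternative that leaves the backup untouched is to clone from the mirrored directory, since the result of `git clone --mirror` is itself a valid clone source. The sketch below builds a throwaway repo so that it is self-contained; the directory names (`source`, `org_backup/demo`, `restored`) and the `main` branch are illustrative, and on a real backup only the final `git clone` step would be needed.

```bash
work="$(mktemp -d)"

# Build a small source repo to stand in for the remote.
git init -q "$work/source"
git -C "$work/source" symbolic-ref HEAD refs/heads/main
echo "hello" > "$work/source/README.md"
git -C "$work/source" add README.md
git -C "$work/source" -c user.name="Example" -c user.email="example@example.com" \
  commit -q -m "initial commit"

# Stand-in for what a backup stores: a mirror clone (bare .git contents).
git clone -q --mirror "$work/source" "$work/org_backup/demo"

# Restore a working tree by cloning from the mirror; the backup is not modified.
git clone -q "$work/org_backup/demo" "$work/restored"
ls "$work/restored"
```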
## Reclone Command
The `ghorg reclone` command is a way to store all your `ghorg clone` commands in one configuration file and makes calling long or multiple `ghorg clone` commands easier.
Once your [reclone.yaml](https://github.com/gabrie30/ghorg/blob/master/sample-reclone.yaml) configuration is set you can call `ghorg reclone` to clone each entry individually or clone all at once, see examples below.
Each reclone entry can have:
- `cmd`: The ghorg clone command to execute (required)
- `description`: A description of what the command does (optional)
- `post_exec_script`: Path to a script that will be called after the clone command finishes (optional). The script is always called, regardless of success or failure, and receives two arguments: the status (`success` or `fail`) and the name of the reclone entry. This allows you to implement custom notifications, monitoring, or other automation.
Example `reclone.yaml` entry:
```yaml
gitlab-examples:
  cmd: "ghorg clone gitlab-examples --scm=gitlab --token=XXXXXXX"
  post_exec_script: "/path/to/notify.sh"
```
Example script for `post_exec_script` (e.g. `/path/to/notify.sh`):
```sh
#!/bin/sh
STATUS="$1"
NAME="$2"
if [ "$STATUS" = "success" ]; then
  # Success webhook
  curl -fsS https://hc-ping.com/your-uuid-here
else
  # Failure webhook
  curl -fsS https://hc-ping.com/your-uuid-here/fail
fi
```
```
# To clone all the entries in your reclone.yaml omit any arguments
ghorg reclone
```
```
# To run one or more entries you can pass arguments
ghorg reclone kubernetes-sig-staging kubernetes-sig
```
```
# To view all your reclone commands
# NOTE: This command prints tokens to stdout
ghorg reclone --list
```
#### Setup
Add a [reclone.yaml](https://github.com/gabrie30/ghorg/blob/master/sample-reclone.yaml) to your `$HOME/.config/ghorg` directory. You can use the following command to set it for you with examples to use as a template
```
curl https://raw.githubusercontent.com/gabrie30/ghorg/master/sample-reclone.yaml > $HOME/.config/ghorg/reclone.yaml
```
Update the file with the commands you wish to run.
#### Automating Reclone
For automated execution, ghorg ships with two companion commands. See the linked examples for usage, flags, and endpoint details:
- **`ghorg reclone-server`** — Start an HTTP server that triggers reclone commands via HTTP requests. See [examples/reclone-server.md](https://github.com/gabrie30/ghorg/blob/master/examples/reclone-server.md).
- **`ghorg reclone-cron`** — Run reclone on a scheduled interval. See [examples/reclone-cron.md](https://github.com/gabrie30/ghorg/blob/master/examples/reclone-cron.md).
## Using Docker
The provided images are built for both `amd64` and `arm64` architectures and are available solely on the GitHub Container Registry, [ghcr.io](https://github.com/gabrie30/ghorg/pkgs/container/ghorg).
```shell
# Should print help message
# You can also specify a version as the tag, such as ghcr.io/gabrie30/ghorg:v1.9.9
docker run --rm ghcr.io/gabrie30/ghorg:latest
```
> Note: There are also tags available for the latest on trunk, such as `master` or `master-`, but these **are not recommended**.
The commands for ghorg are parsed as docker commands. The entrypoint is the `ghorg` binary, hence you only need to enter remaining arguments as follows:
```shell
docker run --rm ghcr.io/gabrie30/ghorg \
clone kubernetes --token=bGVhdmUgYSBjb21tZW50IG9uIGlzc3VlIDY2
```
The image ships with the following environment variables set:
```shell
GHORG_CONFIG=/config/conf.yaml
GHORG_RECLONE_PATH=/config/reclone.yaml
GHORG_ABSOLUTE_PATH_TO_CLONE_TO=/data
```
These can be overridden, if necessary, by passing the `-e` flag to the docker run command, e.g. `-e GHORG_GITHUB_TOKEN=bGVhdmUgYSBjb21tZW50IG9uIGlzc3VlIDY2`.
### Persisting Data on the Host
In order to store data on the host, it is required to bind mount a volume:
- `$HOME/.config/ghorg:/config`: Mounts your config directory inside the container, to access `conf.yaml` and `reclone.yaml`.
- `$HOME/repositories:/data`: Mounts your local data directory inside the container, where repos will be downloaded by default.
```shell
docker run --rm \
-e GHORG_GITHUB_TOKEN=bGVhdmUgYSBjb21tZW50IG9uIGlzc3VlIDY2 \
-v $HOME/.config/ghorg:/config `# optional` \
-v $HOME/repositories:/data \
ghcr.io/gabrie30/ghorg:latest \
clone kubernetes --match-regex=^sig
```
> Note: Altering `GHORG_ABSOLUTE_PATH_TO_CLONE_TO` will require changing the mount location from `/data` to the new location inside the container.
A shell alias might make this more practical:
```shell
alias ghorg="docker run --rm -v $HOME/.config/ghorg:/config -v $HOME/repositories:/data ghcr.io/gabrie30/ghorg:latest"
# Using the alias: creates and cleans up the container
ghorg clone kubernetes --match-regex=^sig
```
## Tracking Clone Data Over Time
To track data on your clones over time, you can use the ghorg stats feature. It is recommended to enable ghorg stats in your configuration file by setting `GHORG_STATS_ENABLED=true`. This ensures that each clone operation is logged automatically without needing to set the command line flag `--stats-enabled` every time. **The ghorg stats feature is disabled by default and needs to be enabled.**
When ghorg stats is enabled, the CSV file `_ghorg_stats.csv` is created in the directory specified by `GHORG_ABSOLUTE_PATH_TO_CLONE_TO`. This file contains detailed information about each clone operation, which is useful for auditing and tracking purposes such as the size of the clone and the number of new commits over time.
Below are the headers and their descriptions. Note that these headers may change over time. If there are any changes in the headers, a new file named `_ghorg_stats_new_header_${sha256HashOfHeader}.csv` will be created to prevent incorrect data from being added to your CSV.
- **datetime**: Date and time of the clone in YYYY-MM-DD hh:mm:ss format
- **clonePath**: Location of the clone directory
- **scm**: Name of the source control used
- **cloneType**: Either user or org clone
- **cloneTarget**: What is specified after the clone command `ghorg clone `
- **totalCount**: Total number of resources expected to be cloned or pulled
- **newClonesCount**: Sum of all new repos cloned
- **existingResourcesPulledCount**: Sum of all repos that were pulled
- **dirSizeInMB**: The size in megabytes of the output dir
- **newCommits**: Sum of all new commits in all repos pulled
- **cloneInfosCount**: Number of clone Info messages
- **cloneErrorsCount**: Number of clone Issues/Errors
- **updateRemoteCount**: Number of remotes updated
- **pruneCount**: Number of repos pruned
- **hasCollisions**: Whether there were any name collisions; this can only happen with GitLab clones
- **ghorgignore**: If a ghorgignore was used in the clone
- **ghorgonly**: If a ghorgonly was used in the clone
- **totalDurationSeconds**: Total time in seconds for the entire clone operation
- **ghorgVersion**: Version of ghorg used in the clone
#### Converting CSV to JSON
```bash
go install github.com/gabrie30/csvToJson@latest && \
csvToJson _ghorg_stats.csv
```
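Since the headers may change over time, scripts that read the stats file are more robust when they look columns up by name rather than by position. A minimal sketch, assuming a hypothetical helper named `ghorg_stats_column` and assuming no field contains a comma (a path with a comma would need a real CSV parser):

```bash
# Print every value of one named column from a ghorg stats CSV.
# $1 = path to the CSV file, $2 = header name, e.g. newCommits
ghorg_stats_column() {
  awk -F',' -v col="$2" '
    NR == 1 {
      # Find the index of the requested header in the first row.
      for (i = 1; i <= NF; i++) if ($i == col) idx = i
      if (!idx) { print "column not found: " col > "/dev/stderr"; exit 1 }
      next
    }
    { print $idx }
  ' "$1"
}
```

For example, `ghorg_stats_column /data/_ghorg_stats.csv newCommits | awk '{ s += $1 } END { print s + 0 }'` would total the new commits across all recorded clones, where `/data` stands in for your clone directory.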
## Windows support
Windows is supported when built with golang or as a [prebuilt binary](https://github.com/gabrie30/ghorg/releases/latest). However, the readme and other documentation are not geared towards Windows users.
Alternatively, Windows users can install ghorg using [scoop](https://scoop.sh/#/):
```
scoop bucket add main
scoop install ghorg
```
## Troubleshooting
- If you are having trouble cloning repos, try to clone one of the repos locally, e.g. by manually running `git clone https://github.com/your_private_org/your_private_repo.git`. If this does not work, ghorg will not work either. Your git client must first be set up to clone the target repos. If you normally clone using an SSH key, use the `--protocol=ssh` flag with ghorg. This will fetch the SSH clone URLs instead of the HTTPS clone URLs.
- If you are cloning a large org you may see `Error: open /dev/null: too many open files`, which means you need to increase your ulimits; there are lots of docs online for this. Another solution is to decrease the number of concurrent clones: use the `--concurrency` flag to set a value lower than 25 (the default).
- If your GitHub org is behind SSO, you will need to authorize your token, see [here](https://docs.github.com/en/github/authenticating-to-github/authorizing-a-personal-access-token-for-use-with-saml-single-sign-on)
- If your GitHub Personal Access Token is only finding public repos, give your token all the `repo` permissions
- Make sure your `git --version` is >= 2.19.0
- Check for other software, such as anti-malware, that could interfere with ghorg's ability to create a large number of connections, see [issue 132](https://github.com/gabrie30/ghorg/issues/132#issuecomment-889357960). You can also lower the concurrency with `--concurrency=n`; the default is 25.
- To debug yourself, you can call ghorg with the `GHORG_DEBUG=true` env, e.g. `GHORG_DEBUG=true ghorg clone kubernetes`. Note that when this env is set, concurrency is set to 1 and the API key in use will be printed to stdout.
- If you've gotten this far and still have an issue, feel free to raise an issue
- If you’re cloning using https, but you have submodules which are configured to use ssh, you can force git to pull these submodules as well via https by running these commands before running ghorg:
```
git config --global url."https://github.com/".insteadOf git@github.com:
git config --global credential.https://github.com/.helper '! f() { echo username=x-access-token; echo password=$GHORG_GITHUB_TOKEN; };f'
```