# Training Disclosure for hierosoft-logistics

This Training Disclosure, which may be more specifically titled above here (and in this document possibly referred to as "this disclosure"), is based on **Training Disclosure version 1.1.4** at https://github.com/Hierosoft/training-disclosure by Jake Gustafson. Jake Gustafson is probably *not* an author of the project unless listed as a project author, nor necessarily the disclosure editor(s) of this copy of the disclosure unless this copy is the original, in which, among other places, I, Jake Gustafson, state IANAL. The original disclosure is released under the [CC0](https://creativecommons.org/public-domain/cc0/) license, but regarding any text that differs from the original: This disclosure also functions as a claim of copyright to the scope described in the paragraph below, since potentially in some jurisdictions output not of direct human origin, by certain means of generation at least, may not be copyrightable (again, IANAL):

Various author(s) may make claims of authorship to content in the project not mentioned in this disclosure, which this disclosure by way of omission, unless stated elsewhere, implies is of direct human origin unless stated elsewhere. Such statements elsewhere are present and complete if applicable to the best of the disclosure editor(s)' ability. Additionally, the project author(s) hereby claim copyright and claim direct human origin to any and all content in the subsections of this disclosure itself, where scope is defined to the best of the ability of the disclosure editor(s), including the subsection names themselves, unless where stated, and unless implied such as by context, being copyrighted or trademarked elsewhere, or other means of statement or implication according to law in applicable jurisdiction(s).

Disclosure editor(s): Hierosoft LLC

Project author: Hierosoft LLC

This disclosure is a voluntary disclosure of how and where content in or used by this project was produced by LLM(s) or any tools that are "trained" in any way. The main section of this disclosure lists such tools. For each, the version, install location, and scope of training sources are given as specifically as possible. Subsections of this disclosure contain prompts used to generate content, as completely as the disclosure editor(s) are able to provide them.

Tool(s) used:

- GPT-4-Turbo (Version 4o, chatgpt.com)

Scope of use: code described in subsections--typically modified by hand to improve logic, variable naming, integration, etc., but in this commit, unmodified.

## hierosoftlogistics

### regression_suite.py

- September 14, 2024

invert this python dict, and make it an OrderedDict:

```python
{'poikilos': 'Poikilos', 'poikilos': "Hierosoft", 'expertmm': "Hierosoft",
 'expertmm': "Poikilos", 'poikilos': "LibreCSG", 'Poikilos': "LibreCSG",
 "expertmm": "LibreCSG", 'poikilos': "samurai-ide", 'Poikilos': "samurai-ide",
 'Hierosoft': "samurai-ide"}
```

- no, do not invert the dict using code. Invert the dict before that so that you don't have to remove elements.

- I see what you mean. Invert it manually, using the original keys I gave as list items and the original values I gave as keys, so you don't have to discard anything.

- Ok, now make it an OrderedDict to retain the original order I gave. Whenever you provide any argument(s) to the OrderedDict constructor, you must retain the order as keyword arguments such as OrderedDict(name1=value1, name2=value2, ...). If that is impossible due to names that are not valid variable names, you must use a blank constructor, then set each item of the dict using keys manually as separate statements. Order is the whole point of OrderedDict.

- memory updated
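A faithful manual inversion of that paste would look like the following. This is a sketch by the disclosure editor(s), not the verbatim model output: keys such as "samurai-ide" are not valid Python identifiers, so the keyword-argument form of the constructor is impossible and each item is set as a separate statement. (The DEFAULT_MIGRATIONS table in the next prompt is the hand-improved version of this result.)

```python
from collections import OrderedDict

# Each key is one of the original values; each value lists, in order,
# the original keys that mapped to it.
inverted = OrderedDict()
inverted['Poikilos'] = ['poikilos', 'expertmm']
inverted['Hierosoft'] = ['poikilos', 'expertmm']
inverted['LibreCSG'] = ['poikilos', 'Poikilos', 'expertmm']
inverted['samurai-ide'] = ['poikilos', 'Poikilos', 'Hierosoft']
```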
- Ok, I've improved it a bit and want to write a program based on that to show a list of commands to fix the old urls in files.

- actually above, I accidentally pasted this whole document including the prompt below.

- In a project called hierosoft-logistics let's make a script called regression_suite.py in a hierosoftlogistics module. Make a global DEFAULT_DOT_EXTS_LOWER = ['.py', '.cpp', '.h', '.hpp', '.lua', '.cs', '.c', '.md', '.txt', '.rst', '.twig', '.js', '.htm', '.html', '.php']. Make a list SKIP_DIRS containing strings that are names of node modules or laravel folders or any other folders generated by package managers of any of the languages used in the file extensions in the DEFAULT_DOT_EXTS_LOWER list. Then set:

```python
DEFAULT_MIGRATIONS = OrderedDict()
DEFAULT_MIGRATIONS['Poikilos'] = ['poikilos']
DEFAULT_MIGRATIONS['Hierosoft'] = ['poikilos', 'expertmm']
DEFAULT_MIGRATIONS['LibreCSG'] = ['poikilos', 'Poikilos', 'expertmm']
DEFAULT_MIGRATIONS['samurai-ide'] = ['poikilos', 'Poikilos', 'Hierosoft']


def get_new(repo_name):
    for new, olds in NAME_CHANGES.items():
        if repo_name in olds:
            return new
    return None


class ReposState:
    """Manage multiple repos at a time. See also RepoState."""
    def __init__(self):
        self.new_owners = {}
        self.name_changes = {
            'EnlivenMinetest': ['ENLIVEN'],
        }
        self.migrations = DEFAULT_MIGRATIONS
        self.dot_exts = DEFAULT_DOT_EXTS_LOWER
        self.recommended_actions = None
        self.remote_repo_count = None
        self.local_repo_count = None
```

Main should instantiate state = ReposState() and call a method check_everything(DEFAULT_MIGRATIONS). In check_everything(migrations), set self.actual_repos = OrderedDict(), then initialize old_orgs = OrderedDict(), self.remote_repo_count = 0, self.local_repo_count = 0, and self.recommended_actions = []. For each new_org in migrations.keys(), use the GitHub API to download the JSON list of all of the repos under the new_org, and parse the downloaded string. Make another loop in this loop that iterates the repos found and fills self.actual_repos, where the key is new_org and each item is a list of repo names. Also set new_owners[repo] = new_org, and self.remote_repo_count += 1. Then, after the nested loops have completed, call a method check_files_here(parent) like self.check_files_here(os.path.abspath(".")).

That method should have an optional argument depth with a default value of 0. It should iterate every sub in the parent directory. If depth == 0, self.local_repo_count += 1. If os.path.splitext(sub)[1].lower() not in self.dot_exts, continue. Set path = os.path.join(parent, sub). If os.path.isdir(path) and sub not in SKIP_DIRS, recurse properly by passing path instead of ".", and also pass depth=depth+1, then use the continue statement. Just for safety, if not os.path.isfile(path), do logger.warning(f"Neither a file nor a directory: {path}") and use the continue statement in that case as well. Then, below those checks, with no "else" nesting required since using short-circuit logic (continue), open path in binary mode line by line. In that loop, make a sub-loop to iterate for old_org in old_orgs.keys(). Set new_owner = new_owners[sub] if sub in new_owners else "[new owner]". Set new_url = f"git@github.com:{new_owner}/{sub}.git".encode("utf-8"). Set old_url = b"https://github.com/" + sub.encode("utf-8") + b"/" + repo. If old_url in line, set action = f'sed -i "s|{old_url}|{new_url}|g" "{path}"', then append that to self.recommended_actions.
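As a sketch only (by the disclosure editor(s), not verbatim model output): the prompt's construction of old_url is garbled in this log, and the extension check as specified would skip directories, so the following reorders the checks and guesses that the intent was to match each old org's https URL for each known repo. SKIP_DIRS is abbreviated, and attributes such as self.migrations, self.new_owners, self.dot_exts, self.recommended_actions, and self.repo_depth are assumed to be set up as the surrounding prompts describe.

```python
import logging
import os

logging.basicConfig()
logger = logging.getLogger(__name__)

# Abbreviated; the prompt asks for all folders generated by package managers.
SKIP_DIRS = ["node_modules", "vendor", "__pycache__"]


class ReposState:
    # __init__ as in the prompt above, plus self.repo_depth = 0 from the
    # later correction.

    def check_files_here(self, parent, depth=0):
        """Queue sed commands for files that still contain old org URLs."""
        # Assumption: old org names are gathered from the migrations table.
        old_orgs = []
        for olds in self.migrations.values():
            for old in olds:
                if old not in old_orgs:
                    old_orgs.append(old)
        for sub in os.listdir(parent):
            if depth == 0:
                self.local_repo_count += 1
            path = os.path.join(parent, sub)
            if os.path.isdir(path):
                if sub in SKIP_DIRS:
                    continue  # generated by a package manager; skip it
                if (depth == self.repo_depth
                        and not os.path.isdir(os.path.join(path, ".git"))):
                    logger.warning(f"# {path} is not a git repo")
                    continue
                self.check_files_here(path, depth=depth + 1)
                continue
            if not os.path.isfile(path):
                logger.warning(f"Neither a file nor a directory: {path}")
                continue
            if os.path.splitext(sub)[1].lower() not in self.dot_exts:
                continue  # short-circuit: not a file type we scan
            with open(path, 'rb') as stream:
                for line in stream:
                    for repo, new_owner in self.new_owners.items():
                        new_url = f"git@github.com:{new_owner}/{repo}.git"
                        for old_org in old_orgs:
                            old_url = f"https://github.com/{old_org}/{repo}"
                            if old_url.encode("utf-8") in line:
                                action = (f'sed -i "s|{old_url}|{new_url}|g"'
                                          f' "{path}"')
                                self.recommended_actions.append(action)
```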
After everything is finished, main should check if state.recommended_actions is non-empty; if so, print "# Recommended actions:" and then print each list item line by line; otherwise print "# No recommended actions". Call main like sys.exit(main()), and always return 0 from main, since other cases are handled by Python using exceptions.

- use urllib2 to fetch the repos and complete the fetch_repos function: set results = [], then page = None. While True: url = f"https://api.github.com/users/{org_name}/repos"; if page is not None: url = f"https://api.github.com/users/{org_name}/repos?page={page}". Parse the download as json page_object, then for repo_meta in page_object, append repo_meta.get('name') to results. Check if there are more pages, but use urllib2 instead of curl, using the following official GitHub API documentation:

When a response is paginated, the response headers will include a link header. If the endpoint does not support pagination, or if all results fit on a single page, the link header will be omitted. The link header contains URLs that you can use to fetch additional pages of results. For example, the previous, next, first, and last page of results. To see the response headers for a particular endpoint, you can use curl, GitHub CLI, or a library you're using to make requests. To see the response headers if you are using a library to make requests, follow the documentation for that library. To see the response headers if you are using curl or GitHub CLI, pass the --include flag with your request. For example:

```
curl --include --request GET \
  --url "https://api.github.com/repos/octocat/Spoon-Knife/issues" \
  --header "Accept: application/vnd.github+json"
```

If the response is paginated, the link header will look something like this:

```
link: <https://api.github.com/repositories/1300192/issues?page=2>; rel="prev", <https://api.github.com/repositories/1300192/issues?page=4>; rel="next", <https://api.github.com/repositories/1300192/issues?page=515>; rel="last", <https://api.github.com/repositories/1300192/issues?page=1>; rel="first"
```

(end of documentation) If there are no more pages, break; otherwise page = page + 1 if page else 2. Then, after the loop, return results.

- along with the other short-circuit operations, if not os.path.isdir(os.path.join(path, ".git")): logger.warning(f"# {path} is not a git repo"), continue.

- My mistake, only the top level may be a git repo. Instead, in __init__ set self.repo_depth = 0, then in the method change the check to: if depth == self.repo_depth and not os.path.isdir(os.path.join(path, ".git")):

- Ok, now make a pyproject.toml using all fields possible, favoring classifiers and avoiding redundant info, assuming the readme is readme.md, the website is at https://github.com/Hierosoft/hierosoft-logistics, issues and other such urls are in the "/issues" sub-url under that, and the command for main in hierosoftlogistics.regression_suite is regression-suite-hierosoft.
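A sketch of the fetch_repos that prompt describes (an editor's illustration, not verbatim output): it is written for Python 3, where urllib.request provides the urllib2 API, and it follows the quoted documentation by treating rel="next" in the link header as the signal that more pages remain.

```python
import json
from urllib.request import Request, urlopen  # Python 3 home of urllib2's API


def fetch_repos(org_name):
    """Return all repo names for org_name, following GitHub pagination."""
    results = []
    page = None
    while True:
        url = f"https://api.github.com/users/{org_name}/repos"
        if page is not None:
            url = f"https://api.github.com/users/{org_name}/repos?page={page}"
        request = Request(url, headers={"Accept": "application/vnd.github+json"})
        response = urlopen(request)
        page_object = json.loads(response.read().decode("utf-8"))
        for repo_meta in page_object:
            results.append(repo_meta.get('name'))
        # Per the documentation quoted above: a paginated response includes a
        # "link" header, and rel="next" is present while pages remain.
        link = response.headers.get("link", "")
        if 'rel="next"' not in link:
            break
        page = page + 1 if page else 2
    return results
```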
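And a sketch of the kind of pyproject.toml the final prompt describes. The name, readme, URLs, and script entry come from the prompt; the version, description, author name format, and classifier choices are placeholders assumed by the disclosure editor(s), not taken from the project.

```toml
[build-system]
requires = ["setuptools>=61.0"]
build-backend = "setuptools.build_meta"

[project]
name = "hierosoft-logistics"
version = "0.1.0"  # placeholder; not stated in the prompt
description = "Check repos and recommend commands to fix old GitHub URLs."  # assumed wording
readme = "readme.md"
authors = [{name = "Hierosoft LLC"}]
classifiers = [
    "Development Status :: 3 - Alpha",  # assumed
    "Intended Audience :: Developers",
    "Programming Language :: Python :: 3",
    "Operating System :: OS Independent",
]

[project.urls]
Homepage = "https://github.com/Hierosoft/hierosoft-logistics"
Issues = "https://github.com/Hierosoft/hierosoft-logistics/issues"

[project.scripts]
regression-suite-hierosoft = "hierosoftlogistics.regression_suite:main"
```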