fix: remap dead hg.python.org release-notes URLs to GitHub equivalents #2965
nagasrisai wants to merge 4 commits into python:main
Adds a corrected_release_notes_url property to the Release model that converts old Mercurial-hosted URLs (hg.python.org, now unreachable) to their equivalent paths on GitHub, so legacy release pages still have working changelog links. Closes python#2865
Use the new corrected_release_notes_url property so that legacy hg.python.org links are converted to GitHub URLs before rendering. Also guard the anchor so it degrades gracefully when the URL is empty.
Covers hg URL conversion, https variant, modern URLs left unchanged, and empty URL passthrough.
@JacobCoffee @ewdurbin @sethmlarson, could one of you take a look when you get a chance? This fixes the broken release notes links on the downloads page for older Python releases. No migration needed. Thanks!

Looks like the Actions workflows need maintainer approval to run since this is coming from a fork; it's the first contribution from this account. Could @JacobCoffee or @ewdurbin approve the three workflow runs (CI, Lint, Check collectstatic) when you get a chance? The links are in the Checks tab on this PR. Thanks!
hugovk left a comment
Rather than rewriting the URLs on serve, would this be better as a one-off database migration?
    match = re.match(r"https?://hg\.python\.org/cpython/file/([^/]+)/(.+)", url)
    if match:
        tag, path = match.group(1), match.group(2)
        return f"https://github.com/python/cpython/blob/{tag}/{path}"

re.match can be confusing because it anchors the start of the string, but not the end.
Can we use re.fullmatch instead? Or re.search with explicit ^?
https://docs.python.org/3/library/re.html#search-vs-match
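
To make the anchoring point concrete, here is a minimal sketch. The pattern and input string are hypothetical, picked only so the three calls disagree; they are not taken from the PR.

```python
import re

# Hypothetical pattern and input, chosen only to make the difference visible.
pattern = r"v\d+\.\d+"
text = "v3.3.6"

prefix_hit = re.match(pattern, text)             # anchored at the start only
full_hit = re.fullmatch(pattern, text)           # must consume the whole string
anchored_hit = re.search(rf"^{pattern}$", text)  # explicit anchors on both ends

print(prefix_hit.group(0))  # v3.3 (the trailing ".6" is silently ignored)
print(full_hit)             # None
print(anchored_hit)         # None
```

With `re.match`, trailing text the pattern does not describe is accepted without complaint, which is the source of the confusion mentioned above.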
Or better yet, ditch the regex:
    - match = re.match(r"https?://hg\.python\.org/cpython/file/([^/]+)/(.+)", url)
    - if match:
    -     tag, path = match.group(1), match.group(2)
    -     return f"https://github.com/python/cpython/blob/{tag}/{path}"
    + for prefix in (
    +     "http://hg.python.org/cpython/file/",
    +     "https://hg.python.org/cpython/file/",
    + ):
    +     if url.startswith(prefix):
    +         return "https://github.com/python/cpython/blob/" + url[len(prefix):]
    + return url

Actually:
- https://www.python.org/downloads/release/python-336/ uses a URL with /file/: https://hg.python.org/cpython/file/v3.3.6/Misc/NEWS
- https://www.python.org/downloads/release/python-273/ uses a URL with /raw-file/: http://hg.python.org/cpython/raw-file/v2.7.3/Misc/NEWS

We should handle both.
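
One way the prefix approach might be extended to cover both path kinds is sketched below. The function name `remap_hg_url` and the constants are hypothetical, and sending /raw-file/ URLs to GitHub's /blob/ view (rather than a raw view) is an assumption, not something settled in this thread.

```python
# Hypothetical helper; names are illustrative and not taken from the PR.
HG_PREFIXES = tuple(
    f"{scheme}://hg.python.org/cpython/{kind}/"
    for scheme in ("http", "https")
    for kind in ("file", "raw-file")
)
# Assumption: both /file/ and /raw-file/ map to the rendered "blob" view.
GITHUB_BLOB = "https://github.com/python/cpython/blob/"


def remap_hg_url(url: str) -> str:
    """Return the GitHub equivalent of an hg.python.org URL, else the URL unchanged."""
    for prefix in HG_PREFIXES:
        if url.startswith(prefix):
            # Keep everything after the prefix: "<tag>/<path>".
            return GITHUB_BLOB + url[len(prefix):]
    return url


print(remap_hg_url("http://hg.python.org/cpython/raw-file/v2.7.3/Misc/NEWS"))
# https://github.com/python/cpython/blob/v2.7.3/Misc/NEWS
```

Because `str.startswith` also accepts a tuple, the loop could be replaced by a single membership check, but looping keeps the matched prefix at hand for slicing.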
Good point, the str.startswith approach is much cleaner. Will switch to that — also makes it easier to extend for the raw-file case you spotted below.
Good catch, missed that one. Will handle both /file/ and /raw-file/ in the updated version.
Thanks for the thorough review! On the migration question — happy to go either way. The on-serve approach felt lighter since there are only a handful of affected legacy releases, but a one-off migration is cleaner long-term and avoids the runtime check on every page load. I'll update the PR to use |
The "Release notes" links on the downloads page for Python 3.3.6 and earlier are broken — they point to hg.python.org which was retired and now returns 404.
Added a corrected_release_notes_url property to the Release model that checks whether a stored URL points to hg.python.org and, if so, remaps it to the equivalent path on GitHub. For example, http://hg.python.org/cpython/file/v3.3.6/Misc/NEWS becomes https://github.com/python/cpython/blob/v3.3.6/Misc/NEWS. The database field itself is left untouched so no migration is needed.

Also updated the downloads index template to use the corrected URL, and added a guard so releases with no URL at all render as plain text rather than an empty anchor.
Four tests added to ReleaseNotesURLTests covering the http and https hg variants, a modern URL that should pass through unchanged, and an empty URL.

Closes #2865
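
For readers without the diff at hand, the four cases described above could look roughly like the following. This is a sketch only: `corrected_url` is a hypothetical plain-function stand-in for the model property, the class name is illustrative, and the modern URL is an arbitrary example. The PR's actual tests are not reproduced here.

```python
import unittest

# Hypothetical stand-in for the Release.corrected_release_notes_url property.
_HG_PREFIXES = (
    "http://hg.python.org/cpython/file/",
    "https://hg.python.org/cpython/file/",
)


def corrected_url(url: str) -> str:
    for prefix in _HG_PREFIXES:
        if url.startswith(prefix):
            return "https://github.com/python/cpython/blob/" + url[len(prefix):]
    return url


class ReleaseNotesURLSketchTests(unittest.TestCase):
    EXPECTED = "https://github.com/python/cpython/blob/v3.3.6/Misc/NEWS"

    def test_http_hg_url_is_remapped(self):
        url = "http://hg.python.org/cpython/file/v3.3.6/Misc/NEWS"
        self.assertEqual(corrected_url(url), self.EXPECTED)

    def test_https_hg_url_is_remapped(self):
        url = "https://hg.python.org/cpython/file/v3.3.6/Misc/NEWS"
        self.assertEqual(corrected_url(url), self.EXPECTED)

    def test_modern_url_is_unchanged(self):
        url = "https://docs.python.org/3/whatsnew/3.12.html"
        self.assertEqual(corrected_url(url), url)

    def test_empty_url_passes_through(self):
        self.assertEqual(corrected_url(""), "")
```

Saved as a module, this can be run with `python -m unittest`.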