Let’s add a take a look at case first to confirm the difficulty with a number of slashes and authentication.
Let’s verify the prevailing take a look at file first to see the place we should always add our take a look at.
Studying exams/test_requests.py
Let’s add our take a look at case to the TestRequests class. First, let’s discover a good location close to different URL-related exams.
Trying to find test_path
Good, let’s add our take a look at case proper after the test_path_is_not_double_encoded take a look at.
IDG
Now let’s modify the prepare_url methodology in fashions.py to protect a number of slashes. The difficulty is probably going in how the trail is dealt with earlier than being handed to urlunparse.
IDG
Notice {that a} null is a suspect selection of marker.
Solver:
Let’s run the exams to confirm our modifications repair the difficulty. We’ll use pytest to run simply our new take a look at case.
$ python -m pytest exams/test_requests.py -k test_path_with_multiple_slashes -v
============================= take a look at session begins ==============================
platform linux -- Python 3.9.20, pytest-8.3.3, pluggy-1.5.0 -- /usr/native/bin/python
cachedir: .pytest_cache
rootdir: /repo
configfile: pyproject.toml
plugins: cov-5.0.0, httpbin-2.1.0
amassing ... collected 330 gadgets / 329 deselected / 1 chosen
exams/test_requests.py::TestRequests::test_path_with_multiple_slashes FAILED [100%]
=================================== FAILURES ===================================
_________________ TestRequests.test_path_with_multiple_slashes _________________
self =
def test_path_with_multiple_slashes(self):
"""Take a look at that URLs with a number of slashes are preserved."""
url = "http://instance.com//path//to//useful resource"
request = requests.Request("GET", url).put together()
> assert request.url == url
E AssertionError: assert 'http://examp...topercent00resource' == 'http://examp.../to//useful resource'
E
E - http://instance.com//path//to//useful resource
E ? ^ ^^ ^^
E + http://instance.com/%00pathpercent00topercent00resource
E ? ^^^ ^^^ ^^^
exams/test_requests.py:151: AssertionError
=========================== brief take a look at abstract information ============================
FAILED exams/test_requests.py::TestRequests::test_path_with_multiple_slashes
====================== 1 failed, 329 deselected in 0.75s =======================
Exit Code: 1
The take a look at failure exhibits that our null byte alternative method isn’t working accurately – the null bytes are getting URL encoded. Let’s strive a unique method utilizing a extra URL-safe momentary marker.
IDG
We have to add the re import on the high of the file.
Let’s verify the present imports first.
Studying src/requests/fashions.py
I see we will add re to the prevailing imports. Let’s add it with the opposite normal library imports on the high.
IDG
Now let’s run the take a look at once more to see if our new method works.
$ python -m pytest exams/test_requests.py -k test_path_with_multiple_slashes -v
============================= take a look at session begins ==============================
platform linux -- Python 3.9.20, pytest-8.3.3, pluggy-1.5.0 -- /usr/native/bin/python
cachedir: .pytest_cache
rootdir: /repo
configfile: pyproject.toml
plugins: httpbin-2.1.0, cov-5.0.0
amassing ... collected 330 gadgets / 329 deselected / 1 chosen
exams/test_requests.py::TestRequests::test_path_with_multiple_slashes FAILED [100%]
=================================== FAILURES ===================================
_________________ TestRequests.test_path_with_multiple_slashes _________________
self =
def test_path_with_multiple_slashes(self):
"""Take a look at that URLs with a number of slashes are preserved."""
url = "http://instance.com//path//to//useful resource"
> request = requests.Request("GET", url).put together()
exams/test_requests.py:150:
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
src/requests/fashions.py:299: in put together
p.put together(
src/requests/fashions.py:368: in put together
self.prepare_url(url, params)
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
self = , url="http://instance.com/path/to/useful resource"
params = {}
def prepare_url(self, url, params):
"""Prepares the given HTTP URL."""
#: Settle for objects which have string representations.
#: We're unable to blindly name unicode/str capabilities
#: as this may embody the bytestring indicator (b'')
#: on python 3.x.
#: https://github.com/psf/requests/pull/2238
if isinstance(url, bytes):
url = url.decode("utf8")
else:
url = str(url)
# Take away main whitespaces from url
url = url.lstrip()
# Do not do any URL preparation for non-HTTP schemes like `mailto`,
# `knowledge` and so forth to work round exceptions from `url_parse`, which
# handles RFC 3986 solely.
if ":" in url and never url.decrease().startswith("http"):
self.url = url
return
# Assist for unicode domains and paths.
strive:
scheme, auth, host, port, path, question, fragment = parse_url(url)
besides LocationParseError as e:
elevate InvalidURL(*e.args)
if not scheme:
elevate MissingSchema(
f"Invalid URL {url!r}: No scheme provided. "
f"Maybe you meant https://{url}?"
)
if not host:
elevate InvalidURL(f"Invalid URL {url!r}: No host provided")
# Normally, we wish to strive IDNA encoding the hostname if the string incorporates
# non-ASCII characters. This enables customers to mechanically get the right IDNA
# behaviour. For strings containing solely ASCII characters, we have to additionally confirm
# it would not begin with a wildcard (*), earlier than permitting the unencoded hostname.
if not unicode_is_ascii(host):
strive:
host = self._get_idna_encoded_host(host)
besides UnicodeError:
elevate InvalidURL("URL has an invalid label.")
elif host.startswith(("*", ".")):
elevate InvalidURL("URL has an invalid label.")
# Rigorously reconstruct the community location
netloc = auth or ""
if netloc:
netloc += "@"
netloc += host
if port:
netloc += f":{port}"
# Naked domains aren't legitimate URLs.
if not path:
path = "/"
else:
# Protect a number of consecutive slashes within the path
original_path = path
path = re.sub("//+", "/", path) # Normalize to single slashes for parsing
if isinstance(params, (str, bytes)):
params = to_native_string(params)
enc_params = self._encode_params(params)
if enc_params:
if question:
question = f"{question}&{enc_params}"
else:
question = enc_params
url = requote_uri(urlunparse([scheme, netloc, path, None, query, fragment]))
# Restore a number of slashes in the event that they had been current within the unique path
if path != "/" and "original_path" in locals():
# Discover the trail portion within the ultimate URL and substitute it
> url_parts = record(urlparse(url))
E NameError: identify 'urlparse' just isn't outlined
src/requests/fashions.py:490: NameError
=========================== brief take a look at abstract information ============================
FAILED exams/test_requests.py::TestRequests::test_path_with_multiple_slashes
====================== 1 failed, 329 deselected in 0.55s =======================
Exit Code: 1
Ah, we have to use the imported urlparse. Let’s repair the imports and take a look at a less complicated method.
IDG
Let’s run the take a look at once more to see if this easier method works.
The take a look at passes. The easier resolution works as a result of we’re letting the URL parts cross via with none normalization. Let’s submit our modifications which embody:
Added a take a look at case to confirm a number of slashes are preserved in URLs with and with out authentication
Modified prepare_url to protect the precise path construction with out normalization
Added a remark explaining the preservation of a number of slashes
Solver completed
Dialogue with a founder
Someplace in the course of the session proven within the earlier part, you’ll see that I pressed the proceed fixing button and informed Solver, “You’re heading in the right direction; please proceed.”
I questioned why Solver had stopped right here to attend for consumer enter, so I despatched in a difficulty report and requested, “Why does Solver cease after figuring out the issue? Sure, it did proceed after I pressed the suitable button. This can be a query, not a bug report.”