Please note, this is a STATIC archive of website www.w3resource.com from 19 Jul 2022, cach3.com does not collect or store any user information, there is no "phishing" involved.
w3resource

Python Web Scraping: Verify SSL certificates for HTTPS requests using requests module

Python Web Scraping: Exercise-27 with Solution

Write a Python program to verify SSL certificates for HTTPS requests using requests module.

Note: Requests verifies SSL certificates for HTTPS requests, just like a web browser. By default, SSL verification is enabled, and Requests will throw a SSLError if it’s unable to verify the certificate

Sample Solution:

Python Code:

import requests
print(requests.get('https://w3resource.com'))
print(requests.get('https://wayback.com'))
  
 

Sample Output:

<Response [200]>
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py", line 600, in urlopen
    chunked=chunked)
  File "/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py", line 343, in _make_request
    self._validate_conn(conn)
  File "/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py", line 839, in _validate_conn
    conn.connect()
  File "/usr/local/lib/python3.6/dist-packages/urllib3/connection.py", line 364, in connect
    _match_hostname(cert, self.assert_hostname or server_hostname)
  File "/usr/local/lib/python3.6/dist-packages/urllib3/connection.py", line 374, in _match_hostname
    match_hostname(cert, asserted_hostname)
  File "/usr/lib/python3.6/ssl.py", line 327, in match_hostname
    % (hostname, ', '.join(map(repr, dnsnames))))
ssl.CertificateError: hostname 'wayback.com' doesn't match either of '*.prod.iad2.secureserver.net', 'prod.iad2.secureserver.net'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/requests/adapters.py", line 449, in send
    timeout=timeout
  File "/usr/local/lib/python3.6/dist-packages/urllib3/connectionpool.py", line 638, in urlopen
    _stacktrace=sys.exc_info()[2])
  File "/usr/local/lib/python3.6/dist-packages/urllib3/util/retry.py", line 398, in increment
    raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='wayback.com', port=443): Max retries exceeded with url: / (Caused by SSLError(CertificateError("hostname 'wayback.com' doesn't match either of '*.prod.iad2.secureserver.net', 'prod.iad2.secureserver.net'",),))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/tmp/sessions/d2a193285f0c778a/main.py", line 3, in <module>
    print(requests.get('https://wayback.com'))
  File "/usr/local/lib/python3.6/dist-packages/requests/api.py", line 75, in get
    return request('get', url, params=params, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/requests/api.py", line 60, in request
    return session.request(method=method, url=url, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/requests/sessions.py", line 533, in request
    resp = self.send(prep, **send_kwargs)
  File "/usr/local/lib/python3.6/dist-packages/requests/sessions.py", line 646, in send
    r = adapter.send(request, **kwargs)
  File "/usr/local/lib/python3.6/dist-packages/requests/adapters.py", line 514, in send
    raise SSLError(e, request=request)
requests.exceptions.SSLError: HTTPSConnectionPool(host='wayback.com', port=443): Max retries exceeded with url: / (Caused by SSLError(CertificateError("hostname 'wayback.com' doesn't match either of '*.prod.iad2.secureserver.net', 'prod.iad2.secureserver.net'",),))

Flowchart:

Python Web Scraping Flowchart: Verifiy SSL certificates for HTTPS requests using requests module

Python Code Editor:


Have another way to solve this solution? Contribute your code (and comments) through Disqus.

Previous: Write a Python program to display the contains of different attributes like status_code, headers, url, history, encoding, reason, cookies, elapsed, request and content of a specified resource.

What is the difficulty level of this exercise?



Python: Tips of the Day

Find current directory and file's directory:

To get the full path to the directory a Python file is contained in, write this in that file:

import os 
dir_path = os.path.dirname(os.path.realpath(__file__))

(Note that the incantation above won't work if you've already used os.chdir() to change your current working directory, since the value of the __file__ constant is relative to the current working directory and is not changed by an os.chdir() call.)

To get the current working directory use

import os
cwd = os.getcwd()

Documentation references for the modules, constants and functions used above:

  • The os and os.path modules.
  • The __file__ constant
  • os.path.realpath(path) (returns "the canonical path of the specified filename, eliminating any symbolic links encountered in the path")
  • os.path.dirname(path) (returns "the directory name of pathname path")
  • os.getcwd() (returns "a string representing the current working directory")
  • os.chdir(path) ("change the current working directory to path")

Ref: https://bit.ly/3fy0R6m