This is a short post to document an evolution of MakeDigikeyBoM that was necessary to support creating the BoM for the HappyDay Energy Monitor. 

I’m getting close to the point where I can lay out the HappyDay Energy Monitor PCB.  Since I last used MakeDigikeyBoM, I have moved the Python on my Mac to an Anaconda install:

Python 3.6.1 |Anaconda custom (x86_64)| (default, May 11 2017, 13:04:09)

Previously, I was using Python 2.7, the version that comes with Mac OS.

The changes I made:

  • Used Atom instead of Eclipse.  I find Atom easier to work in than Eclipse.
  • Created a new GitHub repo: BitKnitting/MakeDigikeyBOM_Python3.
  • Ported the scripts from the Python 2.7 libraries to their Python 3 equivalents.  I documented some of this.  The rest was done by bumbling through the scripts and getting them to work.
  • Changed the method used to open a URL.  With the previous method, Digikey decided my code was a bot: after a few opens, the Digikey server returned 403 (Forbidden).  My “workaround” is to send a different user agent.  The things I did to fix this:
    • Removed /scripts from the Digikey URL.  As noted by glidud in their comment to my question, “Digikey’s robots.txt signals pretty clearly that they don’t want bots scraping their catalog data. (Edit: At least, they’ve specifically disallowed /scripts/. You may try going directly to a product’s /product-detail/ page.)”  Previously, I appended the part number to a URL like this one: https://www.digikey.com/scripts/DkSearch/dksus.dll?WT.z_header=search_go&lang=en&keywords=SI8651BB-B-IS1 .  I changed this to: url = 'https://www.digikey.com/products/en?keywords=' + part_number
    • If a 403 is returned, retry the request with a different browser user agent.  I did find out that requests that appear to come from a mobile device do not return the info that is needed, so only desktop user agents are used.  Here is the function (located in makeDigikeyFile.py):
import logging
import random
from urllib.error import HTTPError
from urllib.request import Request, urlopen

# Assumed to be a module-level logger in makeDigikeyFile.py; defined here so
# the function runs on its own.
logger = logging.getLogger(__name__)

def getURLpage(url):
    # Desktop browser user agents only: requests that appear to come from a
    # mobile device did not return the info that was needed.
    userAgentStrings = [
        'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_8_0) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1063.0 Safari/536.3',
        'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_1) AppleWebKit/536.3 (KHTML, like Gecko) Chrome/19.0.1063.0 Safari/536.3',
        'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_10_4) AppleWebKit/600.7.12 (KHTML, like Gecko) Version/8.0.7 Safari/600.7.12']
    html = None  # stays None if every attempt fails
    numTries = 0
    while True:
        try:
            req = Request(url)
            req.add_header('Accept-Language', 'en-US')
            # I was getting 403s because Digikey thought I was a bot, so each
            # attempt picks a user agent at random.  On a 403, the HTTPError
            # handler below tries again.
            req.add_header('User-agent', random.choice(userAgentStrings))
            with urlopen(req) as response:
                html = response.read()
                break
        except HTTPError as e:
            logStr = "Denied access: {}.  Reason: {}".format(url, e.reason)
            logger.info(logStr)
            if e.code == 403 and numTries < 10:
                logger.info("Trying again....")
                numTries += 1
                continue
            else:
                break
    return html
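
The URL change described above can be sketched as follows. This is only an illustration, not code from the repo: build_search_url and is_allowed are hypothetical helper names, and SAMPLE_ROBOTS_TXT is a made-up two-line excerpt that stands in for Digikey's real robots.txt, based solely on the comment that /scripts/ is disallowed. It uses the standard library's urllib.robotparser to check both the old and the new URL against those rules.

```python
from urllib.parse import urlencode
from urllib.robotparser import RobotFileParser

# Hypothetical excerpt standing in for the real robots.txt.  The only rule
# assumed here is the one mentioned in the comment: /scripts/ is disallowed.
SAMPLE_ROBOTS_TXT = """\
User-agent: *
Disallow: /scripts/
"""


def build_search_url(part_number):
    """Build the keyword-search URL for a part number (hypothetical helper)."""
    # urlencode escapes characters such as spaces or '#' in the part number.
    return "https://www.digikey.com/products/en?" + urlencode({"keywords": part_number})


def is_allowed(url, robots_txt=SAMPLE_ROBOTS_TXT, user_agent="*"):
    """Return True if the given robots.txt text permits fetching url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


old_url = ("https://www.digikey.com/scripts/DkSearch/dksus.dll"
           "?WT.z_header=search_go&lang=en&keywords=SI8651BB-B-IS1")
new_url = build_search_url("SI8651BB-B-IS1")

print(new_url)
print("old URL allowed:", is_allowed(old_url))
print("new URL allowed:", is_allowed(new_url))
```

With the sample rules, the old /scripts/ URL is reported as disallowed and the new /products/ URL as allowed. To check against the live file instead, RobotFileParser can be pointed at the site's robots.txt with set_url() and read(), at the cost of one extra request.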
