personal coding experiences: 2017

BFS is a graph search algorithm.

What is a graph?

It has nodes (vertices).
It has edges (links), which can have a value (weight).

This is an example. The numbers show nodes, and the links connect them.

Graphs can be directed or undirected.

The picture shows an undirected graph. You can easily identify it: no arrows on the links, lads.

In a directed graph, as you can guess, you can travel between nodes only in the indicated direction.
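To make the directed/undirected difference concrete, here is a small sketch of one way to store a graph, as an adjacency matrix. The helper name build_adj and the edge list are made up for illustration; they are not taken from the picture.

```python
def build_adj(n, edges, directed=False):
    """Return an n x n adjacency matrix; adj[a][b] == 1 means you can go a -> b."""
    adj = [[0] * n for _ in range(n)]
    for a, b in edges:
        adj[a][b] = 1
        if not directed:
            adj[b][a] = 1  # undirected: the link works both ways
    return adj

edges = [(0, 1), (1, 2)]  # hypothetical example edges
undirected = build_adj(3, edges)
directed = build_adj(3, edges, directed=True)

print(undirected[1][0])  # 1: you can travel back from 1 to 0
print(directed[1][0])    # 0: only 0 -> 1 is allowed
```

An undirected graph always gives a symmetric matrix, which is a quick way to tell the two kinds apart in code.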

My sample code for BFS in Python:




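# NOTE: the adjacency matrix was not included in the original post.
# This 6-node undirected graph is only an assumed example:
# adj[a][b] == 1 means there is a link between a and b.
adj = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 0, 1, 0, 0],
    [1, 0, 0, 1, 1, 0],
    [0, 1, 1, 0, 0, 1],
    [0, 0, 1, 0, 0, 1],
    [0, 0, 0, 1, 1, 0],
]

nodes = [0, 1, 2, 3, 4, 5]
to_visit = [0]   # start node; this list doubles as queue and visited list
parent = {}      # parent[j] = node from which j was first reached
i = 0            # index of the current front of the queue

while i < len(to_visit):
    front = to_visit[i]
    for j in nodes:
        # enqueue every neighbour of front that has not been seen yet
        if j not in to_visit and adj[front][j] == 1:
            to_visit.append(j)
            parent[j] = front
    print(to_visit, parent)
    i += 1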

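BFS can be used to get the shortest path (the one with the fewest links) in an unweighted graph.
The parent of each node indicates the path: follow the parents back from the target node until you reach the start node.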
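Here is a minimal sketch of that idea. It is not the same script as above: it uses collections.deque as the queue, and the function names bfs_parents and shortest_path and the small example graph are assumptions made for illustration.

```python
from collections import deque

def bfs_parents(adj, start):
    """Breadth-first search over an adjacency matrix; returns the parent map."""
    parent = {start: None}      # also serves as the visited set
    queue = deque([start])
    while queue:
        front = queue.popleft()
        for j, linked in enumerate(adj[front]):
            if linked == 1 and j not in parent:
                parent[j] = front
                queue.append(j)
    return parent

def shortest_path(parent, target):
    """Walk the parent links back from target to the start node."""
    if target not in parent:
        return None             # target was never reached
    path = []
    node = target
    while node is not None:
        path.append(node)
        node = parent[node]
    return path[::-1]           # reverse so the path runs start -> target

# Hypothetical undirected graph with links 0-1, 0-2, 1-3, 2-3, 3-4
adj = [
    [0, 1, 1, 0, 0],
    [1, 0, 0, 1, 0],
    [1, 0, 0, 1, 0],
    [0, 1, 1, 0, 1],
    [0, 0, 0, 1, 0],
]
parent = bfs_parents(adj, 0)
print(shortest_path(parent, 4))  # [0, 1, 3, 4]
```

Because BFS reaches nodes level by level, the first time a node gets a parent is already along a path with the fewest links.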

First you have to install the beautifulsoup4 and requests libraries:

pip install beautifulsoup4
pip install requests

Then import these libraries into your script:

from bs4 import BeautifulSoup
import requests

Now the fun part begins.

There are a few steps to do before we can use BeautifulSoup.
First, we use the get method of requests to retrieve data from a specific URL with a GET request, and store the response in the variable r.

r=requests.get(url)

Then the content of that response (which is HTML content, by the way) is used to create a BeautifulSoup soup object.

soup=BeautifulSoup(r.content,"html.parser")

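Naming the parser ("html.parser") is optional; you know that everything in r.content is HTML, right! But if you leave it out, BeautifulSoup picks a parser by itself and prints a warning, so it is better to name it.

Actually, to do any web scraping you only need to know 3 keywords, and that's all. The rest is up to you.
Here they are: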

findAll() function
.contents
.text

findAll() function

soupObject.findAll("element",{"property":'name'}[optional])

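This returns all the HTML elements that have these properties, in list form. The dictionary of properties is optional. In current versions of BeautifulSoup the same method is also spelled find_all().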

.contents

item.contents

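Gives the immediate children of an HTML element as a list. Text and whitespace between tags count as children too, which is why the sample code further down uses indexes like 1, 3 and 5.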

.text

Gets the text (the content visible to you on the website) inside an HTML element, without any HTML tags.
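To see the three keywords together, here is a small sketch that runs on a made-up HTML string instead of a real page, so no request is needed. The HTML, the class name "card" and the company details are invented for illustration.

```python
from bs4 import BeautifulSoup

# Made-up HTML snippet, standing in for r.content from a real page
html = """
<div class="card"><h2>Acme Ltd</h2><span itemprop="telephone">011 234 5678</span></div>
<div class="card"><h2>Beta Co</h2></div>
"""
soup = BeautifulSoup(html, "html.parser")

cards = soup.findAll("div", {"class": "card"})  # list of matching elements
print(len(cards))                 # 2

first = cards[0]
print(len(first.contents))        # 2 immediate children: <h2> and <span>
print(first.contents[0].text)     # Acme Ltd
print(first.text)                 # Acme Ltd011 234 5678 (all text, tags stripped)

phones = first.findAll("span", {"itemprop": "telephone"})
print(phones[0].text)             # 011 234 5678
```

When nothing matches, findAll() returns an empty list, which is what makes checks like `!= []` work.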

This is a sample of scraping a web page and collecting the data to store in an Excel sheet.

from bs4 import BeautifulSoup
import requests


def getSoupObject(url="http://www.list.com/search/home.html"):
    # Fetch the page and parse its HTML
    r = requests.get(url)
    soup = BeautifulSoup(r.content, "html.parser")
    return soup


def getDataFromPage(soupObj):
    lst = []
    # Each listing on the page sits in a div with class "list_l_box"
    divCont = soupObj.findAll("div", {"class": "list_l_box"})

    for item in divCont:
        itemList = item.contents

        # Skip listings that only have the "No image" placeholder
        if itemList[1].findAll("img", {"alt": "No image"}) != []:
            continue

        companyDataList = itemList[3].contents
        telephone = companyDataList[5].findAll("span", {"itemprop": "telephone"})

        # Keep only companies that list a telephone number
        if telephone != []:
            companyName = companyDataList[1].text
            companyTele = telephone[0].text
            lst.append([companyName, companyTele])
    return lst
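The sample stops at returning the list; the step that stores it in a sheet is not shown. One possible way is sketched here with Python's built-in csv module, since a CSV file opens directly in Excel. The function name saveToCsv, the header row, the file name and the sample rows are all assumptions made for this sketch, not part of the original script.

```python
import csv
import os
import tempfile

def saveToCsv(rows, path):
    """Write [name, telephone] rows to a CSV file that Excel can open."""
    # newline="" stops the csv module from adding blank lines on Windows
    with open(path, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(["Company", "Telephone"])  # assumed header row
        writer.writerows(rows)

# Made-up rows in the same shape that getDataFromPage() returns
sample = [["Acme Ltd", "011 234 5678"], ["Beta Co", "077 111 2222"]]
out_path = os.path.join(tempfile.gettempdir(), "companies.csv")
saveToCsv(sample, out_path)

# Read the file back to check what was written
with open(out_path, newline="", encoding="utf-8") as f:
    saved = list(csv.reader(f))
print(saved[1])  # ['Acme Ltd', '011 234 5678']
```

With the real script, the call would look like saveToCsv(getDataFromPage(getSoupObject()), "companies.csv").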


Saturday, January 21, 2017

BFS (Boy Friend Search)....I am kidding its --Breadth First Search--

Friday, January 20, 2017

BeautifulSoup4 extremely beginner guide