


Author | FerventDesert @ cnblogs

http://www.cnblogs.com/buptzym/


When you’ve run thousands of miles through the streets of a city, one obvious thought is, what would it look like if you could draw all the routes through the city?


This article contains a lot of code, so rather than keep you in suspense, here is the final result first. The routes reach Beiqijia in the north, the South Third Ring Road in the south, Dawang Road in the west, and the Capital Airport in the east. One lap of the Second Ring Road is 32 km and one lap of the Third Ring Road is 50 km, which is my limit; the Fourth Ring Road is not on the table for now…


(The project built along the way is hosted at https://github.com/ferventdesert/gpx-crawler)



1. Data source: Yidong GPS

The first thing you need is raw location data. There are plenty of running apps for your phone, but they all share one problem: they don't allow free import and export (presumably to stop users from leaving). So a smart sports watch is the way to go. Mine is a Garmin Fenix 3.



Yidong GPS, on the other hand, is the conscience of the industry: it can synchronize data from Gudong, Garmin watches, and Yueyue Laps, so I used it as the entry point for grabbing all my GPS data.


As for how to synchronize, please refer to the related introduction on the website. Here is a screenshot after I logged in to the website:

http://edooon.com/user/5699607196/record/15414378



Click on it and you’ll see the export route button:



The big annoyance is that there is no batch export button, and exporting hundreds of records one at a time is exhausting. So I decided to automate it with code.


2. Getting the data from the Yidong website


Once logged in, you can see that the record list loads dynamically: more entries are fetched automatically as you scroll to the bottom. The proper approach would be to sniff and parse the HTTP requests. I got lazy and compromised: I scrolled until everything had loaded and saved the resulting HTML file.


The next step is to parse the HTML, mostly with XPath. Experienced readers will get the idea from the picture below:



The highlighted part of the image is the actual address for downloading the GPX file. We save these addresses in urllists. Meanwhile, the metadata is stored in JSON files.


import codecs
import re

from lxml import etree

# Local folder holding the saved page and, later, the downloaded GPX files.
folder = u'D:/buptzym sync disk/Baidu Cloud/My documents /data analysis /rungps/'
# Replace the cookie and userid with your own values.
cookie = ('JSESSIONID=69DF607B71B1F14AFEC090F520B14B55; '
          'logincookie=5699607196$6098898D08E533587E82B33DD9D02196; '
          'persistent_cookie=5699607196$42C885AD38F59DCA407E09C95BE1A60B; '
          'uname_forloginform="[email protected]"; '
          '__utma=54733311.82935663.1447906150.1447937410.1456907433.7; '
          '__utmb=54733311.5.10.1456907433; __utmc=54733311; '
          '__utmz=54733311.1456907433.7.3.utmcsr=baidu|utmccn=(organic)|utmcmd=organic; '
          'cookie_site=auto')
userid = '5699607196'

# Load the saved record-list page.
with codecs.open(folder + 'desert.htm', 'r', 'utf-8') as f:
    html = f.read()
root = etree.HTML(html)
tree = etree.ElementTree(root)

listnode = tree.xpath('//*[@id="feedList"]')
# Splits a summary line on the site's Chinese keywords (ride, run, km, comma,
# time, consumed, kcal). NOTE: the pattern is reconstructed from the indices
# used below; check it against the text in your own saved page.
numre = re.compile(u'骑行|跑步|公里|，|耗时|消耗|大卡')
urllists = []
records = []
for child in listnode[0].iterchildren():
    record = {}
    temp = child.xpath('div[2]/div[1]/a[2]')
    if len(temp) == 0:
        continue
    source = temp[0].attrib['href']
    record['id'] = source.split('/')[-1]
    info = temp[0].text
    numinfo = numre.split(info)
    if len(numinfo) < 7:  # numinfo[6] is read below
        continue
    record['type'] = info[0:2]       # first two characters: activity type
    record['distance'] = numinfo[1]  # distance in kilometers
    record['hot'] = numinfo[6]       # calories burned
    records.append(record)
    urllists.append('http://edooon.com/user/%s/record/export?type=gpx&id=%s' % (userid, record['id']))
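
The text mentions storing the metadata in JSON files, but that step is not shown. A minimal sketch of how it might look, assuming the metadata is a list of dictionaries shaped like record above; the file name records.json, the helper names, and the sample values are made up for illustration:

```python
import json
import os
import tempfile


def save_records(records, path):
    """Write a list of record dictionaries to a UTF-8 JSON file."""
    with open(path, 'w', encoding='utf-8') as f:
        # ensure_ascii=False keeps Chinese text readable in the file.
        json.dump(records, f, ensure_ascii=False, indent=2)


def load_records(path):
    """Read the list of record dictionaries back from the JSON file."""
    with open(path, 'r', encoding='utf-8') as f:
        return json.load(f)


# Hypothetical sample with the same keys as the dictionaries built above.
sample = [{'id': '15414378', 'type': 'run', 'distance': '10.2', 'hot': '650'}]
out_path = os.path.join(tempfile.gettempdir(), 'records.json')
save_records(sample, out_path)
print(load_records(out_path) == sample)
```

Keeping the metadata next to the GPX files makes it possible to redo the statistics later without parsing the saved HTML again.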


Note that downloading requires cookies, so readers need to substitute their own Yidong GPS user ID and login cookie (it is not worth building automatic login for a site like this).


Downloading is easy: find the URL of the export button with an XPath, construct a request that carries the cookie, and save the file.


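import urllib.request

from lxml import etree

opener = urllib.request.build_opener()
opener.addheaders.append(('Cookie', cookie))
# XPath of the export-route link on each record page.
path = '//*[@id="exportList"]/li[1]/a'
for everyURL in urllists:
    record_id = everyURL.split('=')[-1]
    print(record_id)
    # Open the record page and locate the export link.
    url = 'http://edooon.com/user/%s/record/%s' % (userid, record_id)
    f = opener.open(url)
    html = f.read()
    f.close()
    root = etree.HTML(html)
    tree = etree.ElementTree(root)
    links = tree.xpath(path)
    if not links:
        continue
    # Assumes the link's href is relative to .../record/.
    fs = links[0].get('href')
    if fs is None:
        continue
    furl = 'http://edooon.com/user/%s/record/%s' % (userid, fs)
    f = opener.open(furl)
    gpx_bytes = f.read()
    f.close()
    # Name each file after its record id so files do not overwrite each other.
    filename = folder + record_id + '.gpx'
    with open(filename, 'wb') as xmlfile:
        xmlfile.write(gpx_bytes)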


With that, we have saved about 300 GPX files.


3. Parse GPX data


GPX is a general-purpose format for GPS data; the details are easy to look up.
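
For orientation, a GPX file is plain XML in which each track point carries a latitude, a longitude and usually an elevation and a timestamp. A minimal sketch using only the standard library; the two sample points are made up, and the GPX 1.1 namespace is an assumption (the files exported by the site may declare a different version):

```python
import xml.etree.ElementTree as ET

# Made-up two-point track, only to illustrate the layout of a GPX 1.1 file.
SAMPLE_GPX = """<?xml version="1.0" encoding="UTF-8"?>
<gpx version="1.1" creator="example" xmlns="http://www.topografix.com/GPX/1/1">
  <trk><trkseg>
    <trkpt lat="39.9661" lon="116.3550"><ele>50.0</ele><time>2016-03-01T06:30:00Z</time></trkpt>
    <trkpt lat="39.9662" lon="116.3561"><ele>50.5</ele><time>2016-03-01T06:30:05Z</time></trkpt>
  </trkseg></trk>
</gpx>"""

# Namespace prefix used in the XPath-like queries below.
NS = {'gpx': 'http://www.topografix.com/GPX/1/1'}


def track_points(gpx_text):
    """Return the (lat, lon) pairs of every track point in the document."""
    root = ET.fromstring(gpx_text.encode('utf-8'))
    return [(float(p.get('lat')), float(p.get('lon')))
            for p in root.findall('.//gpx:trkpt', NS)]


points = track_points(SAMPLE_GPX)
print(points)
```

This is only meant to show the structure; a dedicated parser also handles distances, durations and elevation statistics.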

We need a GPX parser for Python, and gpxpy is a good choice. Install it with:

pip3 install gpxpy

gpxpy provides a rich interface, but we only need to extract part of the data for our statistics:


import gpxpy


def readgpx(x):
    # dir is the folder that holds the downloaded .gpx files; x is a record id.
    with open(dir + x + '.gpx', 'r') as file:
        txt = file.read()
    gpx = gpxpy.parse(txt)
    # Moving/stopped time and distance, plus maximum speed.
    mv = gpx.get_moving_data()
    dat = {'Moving time': mv.moving_time, 'Stopped time': mv.stopped_time,
           'Moving distance': mv.moving_distance,
           'Stopped distance': mv.stopped_distance,
           'Maximum speed': mv.max_speed}
    dat['Total time'] = gpx.get_duration()
    dat['id'] = str(x)
    # Total climb and descent.
    updown = gpx.get_uphill_downhill()
    dat['Uphill'] = updown.uphill
    dat['Downhill'] = updown.downhill
    timebound = gpx.get_time_bounds()
    dat['Start time'] = timebound.start_time
    dat['End time'] = timebound.end_time
    # Use the first track point as the location of the workout.
    p = gpx.get_points_data()[0]
    dat['lat'] = p.point.latitude
    dat['lng'] = p.point.longitude
    return dat


The readgpx function takes a file name x and returns a dictionary. Applying it to every file gives a table like the following:



Since we only want to plot the Beijing area, we need a coordinate expression that screens out workouts recorded elsewhere. The filter code uses pandas; the full version is in the attachment.


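# detailed is the pandas DataFrame built from the readgpx dictionaries.
# Collect the ids of workouts that start outside the Beijing bounding box.
exceptids = detailed[(detailed.lng < 116.1) | (detailed.lng > 116.7) |
                     (detailed.lat < 39.9) | (detailed.lat > 40.1)].id


def filtercity(r):
    # r is a file path such as '.../15414378.gpx'.
    sp = r.split('/')[-1].split('.')
    if len(sp) < 2 or sp[1] != 'gpx':
        return False
    if sp[0] in exceptids.values:
        return False
    return True

# gpxs is the list of downloaded GPX file paths.
bjids = [r for r in gpxs if filtercity(r)]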


That leaves us with all the workouts recorded in Beijing.
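
The same bounding-box test can be written without pandas, which makes the rule easier to see. A minimal sketch, assuming each row is a dictionary with the id, lat and lng keys returned by readgpx; the helper names and the two sample rows are made up for illustration:

```python
# Bounding box taken from the filter above:
# longitude 116.1 to 116.7, latitude 39.9 to 40.1.
LNG_MIN, LNG_MAX = 116.1, 116.7
LAT_MIN, LAT_MAX = 39.9, 40.1


def in_box(lat, lng):
    """True if the point lies inside the Beijing bounding box."""
    return LNG_MIN <= lng <= LNG_MAX and LAT_MIN <= lat <= LAT_MAX


def ids_outside_box(rows):
    """Return the ids of rows whose starting point is outside the box."""
    return [row['id'] for row in rows if not in_box(row['lat'], row['lng'])]


# Hypothetical rows: one inside the box, one far outside it.
rows = [
    {'id': 'a', 'lat': 39.98, 'lng': 116.36},
    {'id': 'b', 'lat': 31.23, 'lng': 121.47},
]
print(ids_outside_box(rows))
```

Note that only the first point of each track is tested, so a run that starts inside the box and leaves it is still kept.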



4. Drawing the GPS data


Reinventing the wheel is no fun. There is already a powerful library for drawing GPX tracks:

http://avtanski.net/projects/gps/


Unfortunately, the library uses Perl as the development language and GD as the visual rendering library. I spent a lot of time installing GD.


Ubuntu ships with Perl by default, but GD requires libgd. libgd is hard to download from its official website, and the copy I finally got turned out to be the wrong version, so I spent several hours searching the Internet and nearly gave up. In the end, I found that installing the libgd library was as simple as this:


apt-get install libgd-gd2-perl


Searching apt-get for gd or libgd turns up nothing; who would guess that package name without looking it up? As for Perl's CPAN package manager, it is enough to bring you to tears.

Next, download GD 2.56, unpack it, and run:


perl ./Makefile.PL

make

make install


That's all there is to it.

The GPX drawing library introduces itself as follows:


This folder contains several Perl scripts for processing and plottin GPS track data in .GPX format.


Copy all the GPX files into the sample_gpx folder and run:


./runme.sh


If nothing goes wrong, the output should look like this:


I assume you are already familiar with bash; see runme.sh for the remaining details.


The final results are as follows:



When I saw this result, I was stunned! It represents about 2,000 kilometers of running. Within Beijing's Third Ring Road (mostly north of Chang'an Avenue) I have covered all the main roads, along with Chaoyang Park and the Temple of Heaven Park; the North Third Ring Road and Beitucheng Road (along the northern section of Line 10) in particular have been run to death. Every white line is a story, and every point is one of my footprints!


5. Summary


As a step-by-step tutorial, this article is not detailed enough. It also offers little in the way of data analysis (which I have of course done), but I'm sure programmers who run are good at that, so consider this a starting point.


In fact, this could be turned into a web service: runners submit their running-app ID, and the service automatically renders attractive route maps and analysis charts. That would be quite worthwhile!


It took me seven or eight hours to do this, and it was painful; most of that time went into installing GD rather than downloading the data. The lesson: always read the documentation shipped in the installation package, because mismatched versions between libraries can land you in version hell, where the new version cannot be uninstalled and the old version cannot be used. Don't say I didn't warn you!


It is worth mentioning that the GPX files downloaded from Yidong GPS contain no line breaks, which the GPX visualization library cannot parse (its regular expression was written incorrectly). I was tired of wrestling with Perl regular expressions, so I added the line breaks by substitution instead.
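
The substitution itself is not shown in the article. One way it might be done in Python, as a sketch: put every tag on its own line by replacing each '><' pair. The helper names are made up, and the assumption that one tag per line is what the Perl scripts expect is mine, not the library's documentation:

```python
def add_line_breaks(gpx_text):
    """Insert a newline between every pair of adjacent tags."""
    return gpx_text.replace('><', '>\n<')


def fix_file(path):
    """Rewrite one GPX file in place with the line breaks added."""
    with open(path, 'r', encoding='utf-8') as f:
        text = f.read()
    with open(path, 'w', encoding='utf-8') as f:
        f.write(add_line_breaks(text))


# Made-up single-line fragment shaped like a downloaded file.
single_line = '<trkseg><trkpt lat="39.9" lon="116.3"><ele>50</ele></trkpt></trkseg>'
print(add_line_breaks(single_line))
```

Running fix_file over every file in the sample_gpx folder before calling runme.sh would apply the workaround in one pass.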


— the End —