Technologies used: Pandas, Python, SQL, Jupyter.
Perform a demand analysis:
edinburgh_weather
During spring and summer, the effect of weather on bike rentals in Edinburgh was rather minor. The observed period had no gales, extreme temperatures, or exceptionally heavy rainfall. Events held in the city appear to have had a larger effect on rentals.
The effect of air temperature on demand is moderate: R = 0.5.
During the main season the effect of air temperature fades: R = 0.36.
After removing the effect of city events, the effect of weather on rentals is even smaller: R = 0.31.
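The R values above read like correlation coefficients. As a minimal sketch, assuming R is the Pearson coefficient between daily air temperature and daily rental counts, it could be computed as follows. The `daily` table and its values are made up for illustration and are not taken from the dataset.

```python
import pandas as pd

# Hypothetical daily aggregates; the values are invented for illustration only.
daily = pd.DataFrame({
    "temp_c":  [4.0, 7.5, 10.0, 12.5, 15.0, 18.0],
    "rentals": [210, 260, 240, 330, 310, 400],
})

# Pearson correlation coefficient between temperature and rental counts.
r = daily["temp_c"].corr(daily["rentals"], method="pearson")
print(round(r, 2))
```

Restricting `daily` to the main season, or dropping days with city events before calling `corr`, would give the seasonal and event-adjusted variants.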
People rent bikes slightly more at weekends than during the working week.
Average daily number of rentals on working days: 387
Average daily number of rentals at weekends: 443
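One way to arrive at such averages is to count rentals per calendar day and then average the daily counts separately for working days and weekends. The sketch below uses a small hypothetical sample of `started_at` values; the name `rides` is an assumption, not the author's variable.

```python
import pandas as pd

# Hypothetical sample of start timestamps (stored as strings, as in the dataset).
rides = pd.DataFrame({"started_at": [
    "2021-06-04 08:00:00",  # Friday
    "2021-06-04 17:30:00",  # Friday
    "2021-06-05 10:00:00",  # Saturday
    "2021-06-05 11:00:00",  # Saturday
    "2021-06-05 15:00:00",  # Saturday
    "2021-06-07 09:00:00",  # Monday
]})
rides["started_at"] = pd.to_datetime(rides["started_at"])

# Number of rentals per calendar day (time of day stripped by normalize()).
per_day = rides.groupby(rides["started_at"].dt.normalize()).size()

# dayofweek: Monday = 0 ... Sunday = 6, so 5 and 6 are the weekend.
is_weekend = per_day.index.dayofweek >= 5
weekday_mean = per_day[~is_weekend].mean()
weekend_mean = per_day[is_weekend].mean()
print(weekday_mean, weekend_mean)
```

Days with no rentals at all do not appear in `per_day`, so they are not counted as zeros in this sketch.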
Rental durations between 5 and 20 minutes clearly dominate, with around 180 thousand records out of a total of 438,259 rentals.
About 80% of people return the bike within 60 minutes.
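Figures like these can be obtained by binning the `duration` column (in seconds). The following is a sketch on made-up durations; the bin edges and labels are assumptions chosen to match the ranges mentioned above.

```python
import pandas as pd

# Hypothetical rental durations in seconds.
duration = pd.Series([240, 600, 900, 1100, 1500, 3000, 4000, 7200, 450, 1000])

# Bin the durations (converted to minutes) into left-closed intervals.
bins = [0, 5, 20, 60, float("inf")]
labels = ["<5 min", "5-20 min", "20-60 min", ">60 min"]
binned = pd.cut(duration / 60, bins=bins, labels=labels, right=False)
counts = binned.value_counts().sort_index()

# Share of rentals returned within 60 minutes.
share_within_hour = (duration <= 3600).mean()
print(counts)
print(share_within_hour)
```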
The ten busiest stations (rentals + returns)
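A ranking of this kind can be sketched by counting how often each station appears as a start and as an end, then adding the two counts. The rides below are hypothetical; only the column names follow the dataset.

```python
import pandas as pd

# Hypothetical rides; station names are real, the trips are invented.
rides = pd.DataFrame({
    "start_station_name": ["Meadows East", "Meadows East", "Canonmills",
                           "Portobello", "Meadows East"],
    "end_station_name":   ["Canonmills", "Portobello", "Meadows East",
                           "Meadows East", "Canonmills"],
})

starts = rides["start_station_name"].value_counts()
ends = rides["end_station_name"].value_counts()

# Add rentals and returns; fill_value=0 covers stations missing on one side.
traffic = starts.add(ends, fill_value=0).astype(int).sort_values(ascending=False)
top10 = traffic.head(10)
print(top10)
```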
One possible form of the table of distances between stations.
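The source does not say how the distances were computed. As one option, a great-circle (haversine) distance matrix can be built from the station coordinates; the helper `haversine_km` is a hypothetical name, and the three stations are taken from the sample rows of the dataset.

```python
from math import asin, cos, radians, sin, sqrt

import pandas as pd


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two points given in degrees."""
    earth_radius_km = 6371.0
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * earth_radius_km * asin(sqrt(a))


# (latitude, longitude) of three stations from the sample output.
coords = {
    "Charlotte Square": (55.952335, -3.207101),
    "St Andrew Square": (55.954728, -3.192653),
    "Canonmills": (55.962804, -3.196284),
}
names = list(coords)

# Square table: rows and columns are stations, cells are distances in km.
distances = pd.DataFrame(
    [[haversine_km(*coords[a], *coords[b]) for b in names] for a in names],
    index=names,
    columns=names,
)
print(distances.round(2))
```

This gives straight-line distances, not cycling routes, so it is a lower bound on the distance actually ridden.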
Loading the data into a Jupyter notebook.
The output is a table of bike rental data. It shows the IDs of the start and end stations together with their names, descriptions and coordinates, and the rental duration in seconds.
IN:
import pandas as pd
import sqlalchemy
alchemy_conn = sqlalchemy.create_engine(conn_string)  # conn_string: database URL defined elsewhere
edbikes_df = pd.read_sql('edinburgh_bikes', alchemy_conn, index_col=['index'])
edbikes_df.head()
OUT:
index | started_at | ended_at | duration | start_station_id | start_station_name | start_station_description | start_station_latitude | start_station_longitude | end_station_id | end_station_name | end_station_description | end_station_latitude | end_station_longitude |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|
0 | 2018-09-15 08:52:05 | 2018-09-15 09:11:48 | 1182 | 247 | Charlotte Square | North Corner of Charlotte Square | 55.952335 | -3.207101 | 259 | St Andrew Square | North East corner | 55.954728 | -3.192653 |
1 | 2018-09-15 09:24:33 | 2018-09-15 09:41:09 | 995 | 259 | St Andrew Square | North East corner | 55.954749 | -3.192774 | 262 | Canonmills | near Tesco’s | 55.962804 | -3.196284 |
2 | 2018-09-15 09:48:54 | 2018-09-15 10:46:40 | 3466 | 262 | Canonmills | near Tesco’s | 55.962804 | -3.196284 | 250 | Victoria Quay | Entrance to Scottish Government Office | 55.977638 | -3.174116 |
3 | 2018-09-16 12:01:36 | 2018-09-16 12:25:26 | 1430 | 255 | Kings Buildings 4 | X-Y Cafe | 55.922001 | -3.176902 | 254 | Kings Building 3 | Kings Building House | 55.923479 | -3.175385 |
4 | 2018-09-16 12:03:43 | 2018-09-16 12:11:16 | 452 | 255 | Kings Buildings 4 | X-Y Cafe | 55.922001 | -3.176902 | 253 | Kings Building 2 | Sanderson Building | 55.923202 | -3.171646 |
Information about the data.
The table has 438,259 records. Some columns have lower non-null counts, which means data is missing in them.
IN:
edbikes_df.info()
OUT:
<class 'pandas.core.frame.DataFrame'>
Int64Index: 438259 entries, 0 to 12640
Data columns (total 13 columns):
 #   Column                     Non-Null Count   Dtype
---  ------                     --------------   -----
 0   started_at                 438259 non-null  object
 1   ended_at                   438259 non-null  object
 2   duration                   438259 non-null  int64
 3   start_station_id           438259 non-null  int64
 4   start_station_name         438259 non-null  object
 5   start_station_description  435549 non-null  object
 6   start_station_latitude     438259 non-null  float64
 7   start_station_longitude    438259 non-null  float64
 8   end_station_id             438259 non-null  int64
 9   end_station_name           438259 non-null  object
 10  end_station_description    435256 non-null  object
 11  end_station_latitude       438259 non-null  float64
 12  end_station_longitude      438259 non-null  float64
dtypes: float64(4), int64(3), object(6)
memory usage: 46.8+ MB
The records cover the period from 15 September 2018 to 30 June 2021.
IN:
edbikes_df[['started_at','ended_at']].sort_values('started_at')
OUT:
index | started_at | ended_at |
---|---|---|
0 | 2018-09-15 08:52:05 | 2018-09-15 09:11:48 |
1 | 2018-09-15 09:24:33 | 2018-09-15 09:41:09 |
2 | 2018-09-15 09:48:54 | 2018-09-15 10:46:40 |
3 | 2018-09-16 12:01:36 | 2018-09-16 12:25:26 |
4 | 2018-09-16 12:03:43 | 2018-09-16 12:11:16 |
… | … | … |
12636 | 2021-06-30 23:30:31 | 2021-07-01 00:06:10 |
12637 | 2021-06-30 23:36:16 | 2021-07-01 00:05:40 |
12638 | 2021-06-30 23:49:03 | 2021-07-01 00:11:25 |
12639 | 2021-06-30 23:49:03 | 2021-07-01 00:11:52 |
12640 | 2021-06-30 23:58:33 | 2021-07-01 00:07:15 |
438259 rows × 2 columns
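The `info()` output shows `started_at` and `ended_at` stored as `object` (strings), so the sort above is a string sort; it works here only because the ISO-like format sorts chronologically. A hedged alternative is to convert the column to datetimes and read the range directly. The `sample` frame below is a hypothetical three-row stand-in for `edbikes_df`.

```python
import pandas as pd

# Hypothetical stand-in for edbikes_df with string timestamps.
sample = pd.DataFrame({"started_at": [
    "2018-09-15 08:52:05",
    "2021-06-30 23:58:33",
    "2018-09-16 12:01:36",
]})

# Convert once, then min/max give the covered period without sorting.
sample["started_at"] = pd.to_datetime(sample["started_at"])
first, last = sample["started_at"].min(), sample["started_at"].max()
print(first, last)
```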
Looking for missing values.
The start_station_description column is missing 2710 values and the end_station_description column is missing 3003.
IN:
edbikes_df.isna().sum()
OUT:
started_at                      0
ended_at                        0
duration                        0
start_station_id                0
start_station_name              0
start_station_description    2710
start_station_latitude          0
start_station_longitude         0
end_station_id                  0
end_station_name                0
end_station_description      3003
end_station_latitude            0
end_station_longitude           0
dtype: int64
Identifying the rows with missing values in start_station_description.
The values are missing for start stations named 'Dalry Road Lidl'.
IN:
condition = edbikes_df[['start_station_name','start_station_description']].isna().any(axis=1)
edbikes_df.loc[condition,'start_station_name'].unique()
OUT:
array(['Dalry Road Lidl'], dtype=object)
Identifying the rows with missing values in end_station_description.
The values are missing for end stations named 'Dalry Road Lidl'.
IN:
condition = edbikes_df[['end_station_name','end_station_description']].isna().any(axis=1)
edbikes_df.loc[condition,'end_station_name'].unique()
OUT:
array(['Dalry Road Lidl'], dtype=object)
Filling the missing values in the table.
IN:
edbikes_df.fillna('Dalry Road Lidl',inplace=True)
edbikes_df.info()
OUT:
<class 'pandas.core.frame.DataFrame'>
Int64Index: 438259 entries, 0 to 12640
Data columns (total 13 columns):
 #   Column                     Non-Null Count   Dtype
---  ------                     --------------   -----
 0   started_at                 438259 non-null  object
 1   ended_at                   438259 non-null  object
 2   duration                   438259 non-null  int64
 3   start_station_id           438259 non-null  int64
 4   start_station_name         438259 non-null  object
 5   start_station_description  438259 non-null  object
 6   start_station_latitude     438259 non-null  float64
 7   start_station_longitude    438259 non-null  float64
 8   end_station_id             438259 non-null  int64
 9   end_station_name           438259 non-null  object
 10  end_station_description    438259 non-null  object
 11  end_station_latitude       438259 non-null  float64
 12  end_station_longitude      438259 non-null  float64
dtypes: float64(4), int64(3), object(6)
Listing the start stations: ID, name, description, coordinates.
drop_duplicates() removes duplicates (it compares table rows as a whole).
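Calling `fillna` with a single value fills every missing cell in the whole frame. That is safe here because only the two description columns contain gaps, but a narrower variant makes the intent explicit. The sketch below, on a hypothetical two-row `sample`, fills each missing description with the station name from the same row.

```python
import pandas as pd

# Hypothetical stand-in for edbikes_df with gaps in the description columns.
sample = pd.DataFrame({
    "start_station_name": ["Dalry Road Lidl", "Canonmills"],
    "start_station_description": [None, "near Tesco's"],
    "end_station_name": ["Canonmills", "Dalry Road Lidl"],
    "end_station_description": ["near Tesco's", None],
})

# Fill only the description columns, using the station name of the same row.
for side in ("start", "end"):
    name_col = f"{side}_station_name"
    desc_col = f"{side}_station_description"
    sample[desc_col] = sample[desc_col].fillna(sample[name_col])

print(sample.isna().sum().sum())
```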
IN:
start = (edbikes_df[['start_station_id','start_station_name','start_station_latitude','start_station_longitude','start_station_description']].drop_duplicates()
    .rename(columns={'start_station_id':'station_id','start_station_name':'station_name','start_station_latitude':'latitude',
                     'start_station_longitude':'longitude','start_station_description':'station_description'})
)
print('Start stations:')
print('\n',start)
OUT:
Start stations:

       station_id               station_name   latitude  longitude  \
index
0             247           Charlotte Square  55.952335  -3.207101
1             259           St Andrew Square  55.954749  -3.192774
2             262                 Canonmills  55.962804  -3.196284
3             255          Kings Buildings 4  55.922001  -3.176902
5             253           Kings Building 2  55.923202  -3.171646
...           ...                        ...        ...        ...
5853         2268               Picady Place  55.956535  -3.186248
5970         2268              Picardy Place  55.956535  -3.186248
3490         2265   Musselburgh Brunton Hall  55.943961  -3.058307
33            358                 Leith Walk  55.965176  -3.176180
9257         1739  Edinburgh Royal Infirmary  55.921354  -3.136971

                      station_description
index
0        North Corner of Charlotte Square
1                       North East corner
2                            near Tesco's
3                                X-Y Cafe
5                      Sanderson Building
...                                   ...
5853                  Outside Omni Centre
5970                  Outside Omni Centre
3490      Adjacent to the Brunton Theatre
33     Leith Walk opposite Dalmeny Street
9257      Front of new Sick Kids Hospital

[241 rows x 5 columns]
Listing the end stations: ID, name, description, coordinates.
drop_duplicates() removes duplicates (only when the values match in all listed columns).
IN:
end = (edbikes_df[['end_station_id','end_station_name','end_station_latitude','end_station_longitude','end_station_description']].drop_duplicates()
    .rename(columns={'end_station_id':'station_id','end_station_name':'station_name','end_station_latitude':'latitude',
                     'end_station_longitude':'longitude','end_station_description':'station_description'})
)
print('End stations:')
print('\n',end)
OUT:
End stations:

       station_id               station_name   latitude  longitude  \
index
0             259           St Andrew Square  55.954728  -3.192653
1             262                 Canonmills  55.962804  -3.196284
2             250              Victoria Quay  55.977638  -3.174116
3             254           Kings Building 3  55.923479  -3.175385
4             253           Kings Building 2  55.923202  -3.171646
...           ...                        ...        ...        ...
5806         2268               Picady Place  55.956535  -3.186248
5963         2268              Picardy Place  55.956535  -3.186248
3184         2265   Musselburgh Brunton Hall  55.943961  -3.058307
12970         358                 Leith Walk  55.965176  -3.176180
9272         1739  Edinburgh Royal Infirmary  55.921354  -3.136971

                          station_description
index
0                           North East corner
1                                near Tesco's
2      Entrance to Scottish Government Office
3                        Kings Building House
4                          Sanderson Building
...                                       ...
5806                      Outside Omni Centre
5963                      Outside Omni Centre
3184          Adjacent to the Brunton Theatre
12970      Leith Walk opposite Dalmeny Street
9272          Front of new Sick Kids Hospital

[243 rows x 5 columns]
Creating the 'stations' data frame by combining the 'start' and 'end' stations.
Exploring and cleaning data also means reading it carefully, hunting for errors and typos, and correcting them.
The list contains stations with the same name but different IDs. Some stations with the same name differ only slightly in their coordinates or description.
IN:
stations = pd.concat([end,start]).drop_duplicates().sort_values('station_id')
print('All stations at least once used to start or once to end:')
stations.sort_values('station_id')
OUT:
All stations at least once used to start or once to end:
index | station_id | station_name | latitude | longitude | station_description |
---|---|---|---|---|---|
27 | 171 | George Square | 55.943084 | -3.188311 | George Square, south side in front of University library |
48 | 183 | Waverley Bridge | 55.951344 | -3.191421 | near the top of the South ramp |
16696 | 183 | Waverley Bridge | 55.951981 | -3.191890 | near the top of the South ramp |
14 | 189 | City Chambers | 55.950109 | -3.190258 | City Chambers Quadrangle |
135 | 225 | Waverley Court | 55.951734 | -3.184179 | On Waverley Court forecourt |
1278 | 241 | Depot | 55.972373 | -3.155833 | Depot |
1126 | 242 | Virtual Depot | 55.972402 | -3.155794 | Virtual Depot |
10 | 246 | Royal Commonwealth Pool | 55.939000 | -3.173924 | Royal Commonwealth Pool Entrance |
28 | 247 | Charlotte Square | 55.952335 | -3.207101 | North Corner of Charlotte Square |
608 | 248 | Bristo Square | 55.946004 | -3.188665 | Bristo Square, near Potterrow |
6766 | 248 | Bristo Square | 55.945834 | -3.189053 | Bristo Square, near Potterrow |
12 | 248 | Bristo Square | 55.946004 | -3.188665 | Bristo Square, near Potter Row |
66 | 249 | Fountainbridge | 55.943357 | -3.209248 | Fountainbridge by Gardner’s Crescent |
2 | 250 | Victoria Quay | 55.977638 | -3.174116 | Entrance to Scottish Government Office |
1696 | 250 | Victoria Quay | 55.977617 | -3.174126 | Entrance to Scottish Government Office |
1001 | 251 | Waverley Station | 55.952641 | -3.187527 | Platform level near Calton Road Exit |
119 | 251 | Waverley Station | 55.952641 | -3.187527 | Near Calton Road Exit |
5 | 252 | Kings Buildings 1 | 55.924185 | -3.173831 | Grant Institute |
4 | 253 | Kings Building 2 | 55.923202 | -3.171646 | Sanderson Building |
3 | 254 | Kings Building 3 | 55.923479 | -3.175385 | Kings Building House |
138 | 255 | Kings Buildings 4 | 55.922001 | -3.176902 | X-Y Cafe |
146 | 256 | St Andrews House | 55.953164 | -3.181682 | beside Jacobs ladder |
16 | 257 | Portobello | 55.957004 | -3.116888 | Edinburgh Leisure Tumbles Centre |
13 | 258 | Stockbridge | 55.958566 | -3.208070 | Hamilton Place by Dean Bank Lane |
1798 | 259 | St Andrew Square | 55.954906 | -3.192444 | North East corner |
0 | 259 | St Andrew Square | 55.954728 | -3.192653 | North East corner |
24 | 259 | St Andrew Square | 55.954749 | -3.192774 | North East corner |
147 | 260 | Lauriston Place | 55.944772 | -3.197266 | near Chalmers Street |
2024 | 261 | Brunswick Place – Virtual | 55.960930 | -3.181005 | Junction Brunswick Street and Elm Row |
68 | 261 | Brunswick Place | 55.960930 | -3.181005 | Junction Brunswick Street and Elm Row |
1 | 262 | Canonmills | 55.962804 | -3.196284 | near Tesco’s |
3868 | 264 | Pollock Halls | 55.940081 | -3.171747 | Pollock Halls Entrance |
7 | 264 | Pollock Halls | 55.939963 | -3.171586 | Pollock Halls Entrance |
11 | 265 | Meadows East | 55.939809 | -3.182739 | Melville Terrace |
1638 | 265 | Meadows East | 55.939809 | -3.182739 | Between Melville Terrace and Sciennes |
178 | 266 | Victoria Park | 55.974247 | -3.194482 | Near Craighall Road |
3788 | 266 | Victoria Park | 55.975189 | -3.191030 | Near Craighall Road |
25 | 267 | Launch Day Event | 55.942551 | -3.191381 | The Meadows |
227 | 273 | Shrubhill | 55.962537 | -3.179473 | Sainsbury’s Leith Walk |
356 | 275 | Riego Street | 55.945160 | -3.203678 | Riego Street on junction with East Fountainbridge |
1206 | 277 | Waitrose, Comely Bank | 55.959504 | -3.223428 | Fettes Avenue |
4304 | 280 | Smarter Travel Station | 53.395525 | -2.990138 | The Street |
1945 | 284 | Leith Links | 55.969199 | -3.166885 | Corner of Vanburgh Place and Lochend Road |
2064 | 285 | Ocean Terminal | 55.981286 | -3.176351 | Debenhams / Puregym entrance |
3140 | 289 | Castle Street | 55.951501 | -3.203184 | Near Rose Street |
3213 | 290 | Bruntsfield | 55.937159 | -3.206435 | Corner of Bruntsfield Terrace and Bruntsfield Place |
12115 | 290 | Bruntsfield links | 55.937159 | -3.206435 | Corner of Bruntsfield Terrace and Bruntsfield Place |
3931 | 296 | Castle Terrace | 55.946766 | -3.202038 | Corner of Castle Terrace and Lady Lawson St |
3594 | 297 | Royal Infirmary | 55.924295 | -3.133510 | Cycleway off Little France |
9928 | 299 | Depot Virtual | 55.972335 | -3.155782 | Depot internal station |
1729 | 340 | Meadow Place | 55.940300 | -3.194592 | Corner of Meadow Place and Melville Drive |
1092 | 341 | Warrender Park Road | 55.938363 | -3.198031 | Corner of Warrender Park Road & Spottiswoode Street |
1306 | 342 | Whitehouse Loan | 55.936329 | -3.202295 | Corner of Whitehouse Loan & Bruntsfield Crescent |
1732 | 343 | Thirlestane Road | 55.935365 | -3.198671 | Corner of Thirlestane Road & St. Margaret’s Place |
1434 | 344 | Marchmont Crescent | 55.936397 | -3.194252 | Corner of Marchmont Crescent & Marchmont Road |
1954 | 345 | Colinton Road | 55.934035 | -3.210803 | Colinton Road outside Starbucks |
7112 | 345 | Colinton Road | 55.933398 | -3.212429 | Colinton Road outside Starbucks |
7756 | 345 | Colinton Road | 55.933398 | -3.212429 | Colinton Road next to Napier University |
1700 | 346 | Morningside Road | 55.927986 | -3.209739 | Corner of Morningside Road & Morningside Park by M & S car park |
1061 | 347 | Simon Square | 55.944868 | -3.182671 | Corner of Simon Square & Gilmour Street |
1823 | 349 | Orchard Brae House | 55.955083 | -3.223634 | Outside Orchard Brae House on Queensferry Road |
1252 | 350 | Dalry Road Lidl | 55.941754 | -3.222524 | outside Lidl |
1264 | 351 | Dalry Road Co-op | 55.942715 | -3.221431 | By co-op (outside maplins) |
1223 | 352 | Dundee Terrace | 55.939729 | -3.220603 | Dundee Terrace outside „glass supplies“ |
1221 | 353 | Gibson Terrace | 55.940493 | -3.217144 | Outside student accomodation |
1609 | 354 | South Trinity Road | 55.971269 | -3.207816 | Corner of South Trinity Road & Ferry Road |
1614 | 355 | Inverleith Row | 55.964146 | -3.202074 | Corner of Inverleith Row & Inverleith Terrace |
1329 | 356 | East London Street | 55.959943 | -3.187329 | Outside St. Mary’s Primary School |
1309 | 357 | Macdonald Road | 55.963995 | -3.185189 | Macdonald Road |
1328 | 358 | Leith Walk | 55.965040 | -3.176686 | Leith Walk opposite Dalmeny Street |
12970 | 358 | Leith Walk | 55.965176 | -3.176180 | Leith Walk opposite Dalmeny Street |
1501 | 359 | Causewayside | 55.936430 | -3.180115 | Causewayside outside NLS |
1424 | 365 | Novotel | 55.944896 | -3.199635 | Novotel |
1963 | 366 | Newhaven Road / Dudley Gardens | 55.975921 | -3.191346 | Corner of Dudley Gardens on Newhaven Road |
1676 | 366 | Dudley Gardens | 55.975921 | -3.191346 | Corner of Dudley Gardens on Newhaven Road |
1714 | 366 | Newhaven Road / Dudley Gardens | 55.975927 | -3.191318 | Corner of Dudley Gardens on Newhaven Road |
5831 | 648 | Western General | 55.963458 | -3.232810 | Western General virtual station across from public multi storey |
1523 | 820 | Newkirkgate | 55.970704 | -3.171624 | 13 point docking station beside queen victoria monument |
5608 | 820 | Newkirkgate | 55.970704 | -3.171624 | 19 point docking station beside queen victoria monument |
2941 | 860 | Pollock Halls Virtual | 55.940348 | -3.172108 | On pavement corner by the entrance gates |
6077 | 862 | Cramond Foreshore | 55.980024 | -3.300622 | Cramond foreshore by turning circle |
7405 | 863 | Gamekeeper’s Road | 55.969532 | -3.307305 | Corner of Gamekeeper’s Road & Whitehouse Road |
6633 | 864 | Whitehouse Road | 55.961527 | -3.306114 | Whitehouse road by Sainbury’s & bus stop |
7207 | 865 | Craigleith Road | 55.956619 | -3.237803 | Craigleith road by bus stop |
6404 | 866 | Comely Bank Road | 55.959407 | -3.215660 | Between trees on Comely Bank Road |
6270 | 867 | Henderson Row | 55.960260 | -3.203913 | Next to the Edinburgh Academy |
5825 | 868 | Dundas Street | 55.960944 | -3.201387 | Corner of Dundas Street & Henderson Row |
6058 | 869 | Hillside Crescent 1 | 55.957910 | -3.180693 | West corner of Hillside Crescent & London Road |
12540 | 870 | Hillside Crescent 2 | 55.957793 | -3.175799 | East corner of Hillside Crescent & London Road |
3860 | 870 | Hillside Crescent | 55.957793 | -3.175799 | East corner of Hillside Crescent & London Road |
5899 | 870 | Hillside Crescent 2 | 55.957803 | -3.175479 | East corner of Hillside Crescent & London Road |
8094 | 871 | St. John’s Road 1 | 55.942746 | -3.281623 | Outside Specsavers |
6710 | 872 | St. John’s Road 2 | 55.942945 | -3.290794 | Outside GuitarGuitar |
6607 | 873 | Corstorphine Road | 55.941743 | -3.271484 | Outside Forestry & Land Scotland |
5179 | 874 | Edinburgh Zoo | 55.942120 | -3.269280 | Outside zoo gatehouse |
7224 | 874 | Edinburgh Zoo | 55.942222 | -3.268697 | Outside zoo gatehouse |
741 | 875 | Corstorphine Road – Pinkhill | 55.942364 | -3.265384 | Corner of Corstorphine Road & Pinkhill |
7709 | 876 | Murrayfield | 55.944767 | -3.243688 | Riversdale Crescent |
6855 | 877 | Murrayfield Avenue | 55.946322 | -3.236542 | Corner of Murrayfield Avenue & Murrayfield Place |
6415 | 878 | Balgreen | 55.938938 | -3.251173 | Opposite Jenners Depository |
6313 | 879 | Gladstone Terrace | 55.938024 | -3.184979 | Corner of Gladstone Terrace & Sciennes ROad |
8602 | 880 | Logie Green Road | 55.964035 | -3.195674 | Outside Lidl |
6671 | 881 | Tollcross | 55.944281 | -3.202964 | Outside Piccolino |
6603 | 882 | EICC | 55.946043 | -3.210485 | Between planters next to EICC |
227 | 883 | Queensferry Road | 55.954543 | -3.234728 | Corner of Queensferry Road & Orchard Road |
99 | 884 | IGMM | 55.962487 | -3.232031 | IGMM at Western General Hospital |
6982 | 885 | Wester Coates Terrace | 55.945609 | -3.231716 | Corner of Wester Coates Terrace & Roseburn Terrace |
6604 | 887 | Roseburn Street | 55.944466 | -3.234541 | Corner of Roseburn Street & Russell Gardens |
6573 | 888 | Crichton Street | 55.944750 | -3.186542 | Corner of Crichton Street & Potterow |
7320 | 889 | Murrayfield Tram | 55.941956 | -3.237802 | By the side of the station entrance |
831 | 890 | West Crosscauseway | 55.943862 | -3.184972 | Corner of West Crosscauseway & Buccleugh Street |
840 | 891 | West Newington Place | 55.938212 | -3.178972 | Corner of West Newington Place & Newington Road |
1625 | 901 | Dunbar’s Close Garden | 55.951814 | -3.178726 | Inside garden |
3569 | 964 | Corn Exchange – walk cycle event | 55.927014 | -3.248557 | Temporary event station |
3619 | 965 | Sustrans – walk cycle event | 55.945452 | -3.219680 | Sustrans car parking space 31 for walk cycle event |
12913 | 965 | Haymarket – Murrayfield Rugby Event | 55.945452 | -3.219680 | Sustrans car parking space 31 for Murrayfield Rugby Event |
7049 | 980 | Royal Highland Show – East Gate (19th to 23rd June) | 55.940907 | -3.368872 | Virtual station at east gate of RHC (19th to 23rd June) |
4109 | 981 | RHC – Edinburgh Festival Camping (05th to 26th August) | 55.940655 | -3.381606 | Virtual station for Edinburgh Festival Camping at west gate of RHC (05th to 26th August) |
6622 | 981 | Royal Highland Show – West Gate (19th to 23rd June) | 55.940655 | -3.381606 | Virtual station at west gate of RHC (19th to 23rd June) |
6611 | 982 | Ingliston Park and Ride (19th to 23rd June) | 55.939090 | -3.355913 | Virtual station at Ingliston Park and Ride (19th to 23rd June) |
2011 | 991 | Meadows – Edinburgh Climate Festival | 55.941736 | -3.190361 | 20 bike virtual station for edinburgh climate festival |
9510 | 1017 | Crichton Street | 55.944756 | -3.186917 | 13 point physical docking station |
193 | 1017 | Crichton Street | 55.944784 | -3.186906 | East end of street |
10546 | 1017 | Crichton Street | 55.944756 | -3.186917 | East end of street |
10222 | 1018 | Hunter Square | 55.949692 | -3.187813 | Next to Tron Kirk at top of Blair Street |
9482 | 1018 | Hunter Square | 55.949692 | -3.187813 | 07 point angled physical docking station |
10308 | 1019 | Grassmarket | 55.947097 | -3.197246 | West end of Grassmarket |
11964 | 1024 | Meadow Place 2 | 55.940238 | -3.194640 | |
2037 | 1024 | Meadow Place | 55.940238 | -3.194640 | End of Meadow Place |
1938 | 1025 | Dundee Terrace | 55.939710 | -3.220589 | Corner of Dundee Street & Dundee Terrace |
12039 | 1025 | Dundee Terrace | 55.939710 | -3.220589 | |
4605 | 1026 | Constitution Street | 55.975360 | -3.166442 | Corner of Assembly Street |
12380 | 1026 | Constitution Street | 55.975441 | -3.166806 | Next to Burns Statue |
664 | 1027 | Drummond Street | 55.948337 | -3.182261 | Opposite the Pleasance |
2140 | 1027 | Drummond Street | 55.948337 | -3.182261 | Opposite the Pleasance Grand |
8255 | 1028 | Hunter Square | 55.949798 | -3.187795 | In Hunter Square next to Tron Kirk |
3098 | 1028 | The Tron | 55.950037 | -3.187822 | Next to Tron Kirk Royal Mile |
4936 | 1030 | Fountain Court – Apartments (RESIDENTS ONLY) | 55.943806 | -3.211238 | Fountain Court – Apartments (RESIDENTS ONLY) |
7063 | 1031 | Eden Locke – Aparthotel (RESIDENTS ONLY) | 55.952641 | -3.205666 | Eden Locke – Aparthotel (RESIDENTS ONLY) |
8337 | 1031 | Eden Locke – Aparthotel (RESIDENTS ONLY) | 55.952619 | -3.205678 | Eden Locke – Aparthotel (RESIDENTS ONLY) |
5613 | 1032 | Holyrood Park – Woman’s Tour Of Scotland (Event 11/08/19) | 55.951354 | -3.168591 | Holyrood Park – Woman’s Tour Of Scotland (Event 11/08/19) |
6481 | 1033 | Queen Margaret University | 55.931935 | -3.073046 | Opposite Maggies student union |
16730 | 1038 | South Trinity Road | 55.971325 | -3.207964 | Corner of South Trinity Road and Ferry Road |
16578 | 1039 | Lothian Road | 55.947409 | -3.205765 | Outside the Usher Hall |
1995 | 1040 | Sighthill – Edinburgh College | 55.926665 | -3.289468 | Sighthill Campus |
1016 | 1041 | Milton Road – Edinburgh College | 55.944051 | -3.098567 | Milton Road Campus |
1632 | 1042 | Kings Buildings – Murchison House | 55.924464 | -3.178732 | West end of campus |
1538 | 1050 | EICC | 55.946071 | -3.210396 | Outside Edinburgh International Conference Centre |
1546 | 1051 | Warrender Park Road | 55.938369 | -3.198033 | On corner with Spottiswoode Street |
4146 | 1052 | Surgeons Hall | 55.946643 | -3.185475 | East side of Nicolson Street |
1962 | 1055 | Roseburn Street | 55.944426 | -3.234498 | On corner with Russell Gardens |
3879 | 1056 | Fort Kinnaird | 55.934372 | -3.105038 | Next to Playpark/Cafe Nero |
4188 | 1057 | Pleasance – Edinburgh University Sports Fair | 55.948210 | -3.181597 | Pleasance Sports Complex |
4726 | 1090 | Hillside Crescent | 55.957872 | -3.175888 | East end of Hillside Crescent |
4626 | 1091 | Holyrood Road | 55.949560 | -3.180413 | Opposite St Leonards Land |
2990 | 1092 | Dalry Road Lidl | 55.941791 | -3.222415 | Dalry Road Lidl |
4848 | 1092 | Dalry Road Lidl | 55.941791 | -3.222415 | |
7991 | 1093 | Belford Road | 55.951974 | -3.226125 | Outside Scottish National Gallery of Modern Art |
4757 | 1093 | Belford Road | 55.951974 | -3.226125 | Outside Scottish National Gallery |
10199 | 1094 | HSBC UK Lets Ride – Meadows Event | 55.939978 | -3.189862 | 20 point virtual docking station for UK lets Ride Event |
12099 | 1095 | Dudley Gardens | 55.975940 | -3.191321 | Opposite Victoria Park on Newhaven Road |
12115 | 1096 | West Crosscauseway | 55.943836 | -3.184951 | On island next to Buccleuch Street |
12165 | 1097 | Gladstone Terrace | 55.937963 | -3.185021 | Corner of Gladstone Terrace and Sciennes Road |
12228 | 1098 | Marchmont Crescent | 55.936432 | -3.194150 | Corner of Marchmont Road |
1683 | 1102 | Haymarket Station | 55.945569 | -3.218185 | Haymarket Station |
969 | 1102 | Haymarket Station | 55.945575 | -3.218195 | Haymarket Station |
154 | 1102 | Haymarket Station | 55.945582 | -3.218192 | Haymarket Station |
1305 | 1720 | Dundas Street | 55.960762 | -3.201278 | On corner of Henderson Row |
1194 | 1721 | Tollcross | 55.944248 | -3.203105 | Outside Piccolinos off Lauriston Place |
2070 | 1722 | Cramond Foreshore | 55.980031 | -3.300642 | Near Cramond Beach |
1873 | 1723 | Heriot Watt – Edinburgh Business School | 55.908786 | -3.320170 | Heriot Watt – Edinburgh Business School |
1555 | 1723 | Heriot Watt – Edinburgh Business School | 55.908810 | -3.320142 | Heriot Watt – Edinburgh Business School |
1491 | 1724 | Heriot Watt – Student Accomodation (Anna MacLeod Halls) | 55.908404 | -3.328825 | Heriot Watt – Student Accomodation Heriot Watt – Student Accomodation (Anna MacLeod Halls)) |
2987 | 1725 | Edinburgh Zoo | 55.942115 | -3.269287 | Corstorphine Road, Edinburgh |
4091 | 1726 | Simon Square | 55.944859 | -3.182590 | Between Pleasance and Nicholson Street |
4083 | 1727 | Causewayside | 55.936506 | -3.180166 | Outside National Library of Scotland |
9581 | 1728 | Portobello – Kings Road | 55.957915 | -3.118332 | Foot of Kings Road next to the promenade |
9912 | 1729 | McDonald Road | 55.964031 | -3.185175 | Next to Fire Station |
9689 | 1730 | East London Street | 55.959954 | -3.187198 | Outside St Mary’s Primary School |
10018 | 1731 | Pleasance Courtyard | 55.947567 | -3.181592 | Edinburgh University, The Pleasance |
5106 | 1731 | Pleasance Courtyard | 55.947491 | -3.181112 | Edinburgh University, The Pleasance |
12637 | 1737 | Inverleith Row | 55.964118 | -3.202095 | Corner with Inverleith Terrace |
12959 | 1738 | Wester Coates Terrace | 55.945648 | -3.231847 | Junction with Roseburn Terrace |
6642 | 1739 | Edinburgh Royal Infirmary | 55.920632 | -3.140541 | Little France, Old Dalkeith Road, Edinburgh |
9272 | 1739 | Edinburgh Royal Infirmary | 55.921354 | -3.136971 | Front of new Sick Kids Hospital |
2302 | 1739 | Edinburgh Royal Infirmary | 55.921220 | -3.139076 | Front of new Sick Kids Hospital |
873 | 1740 | Cycling Scotland Conference | 55.940886 | -3.240778 | Murrayfield Stadium |
2523 | 1743 | Logie Green Road | 55.964058 | -3.195700 | Outside Lidl |
1975 | 1744 | Morningside Road | 55.927985 | -3.209750 | Corner of Morningside Park, beside M&S |
2207 | 1745 | Scotland Street | 55.960380 | -3.195470 | Corner with Royal Crescent |
1984 | 1746 | Crescent House | 55.963911 | -3.191862 | Crescent House |
4563 | 1747 | Corstorphine Road | 55.941670 | -3.271524 | Outside Silvan House, Forestry and Land Scotland |
4602 | 1748 | Colinton Road | 55.933416 | -3.212397 | Near Napier University Merchiston Campus |
5210 | 1749 | Dean Street | 55.957278 | -3.214285 | Corner of Dean Park Mews |
6911 | 1752 | IGMM – Western General | 55.962642 | -3.231916 | The Institute of Genetics and Molecular Medicine |
7129 | 1753 | Waitrose Comely Bank | 55.959536 | -3.223434 | On Fettes Avenue next to Waitrose |
7876 | 1754 | Murrayfield Tram | 55.941911 | -3.237800 | Next to Murrayfield Tram Station |
60 | 1756 | Western General Hospital | 55.963454 | -3.232909 | Porterfield Road |
11547 | 1756 | Western General Hospital | 55.962840 | -3.234136 | Porterfield Road |
9904 | 1757 | Meggetland | 55.927587 | -3.233671 | Meggetland Sports Complex |
9253 | 1758 | Queen Margaret University | 55.931980 | -3.073105 | University Courtyard |
1140 | 1763 | Comely Bank Road | 55.959410 | -3.215661 | Outside Raeburn Place Sports Ground |
1305 | 1764 | Craigleith Road | 55.956576 | -3.237940 | Next to Craigleith Hill Bus Stop |
1553 | 1765 | Haymarket Terrace | 55.946064 | -3.223024 | Corner of Magdala Crescent |
3484 | 1766 | Balgreen Road | 55.938942 | -3.251111 | Opposite Jenners Depository |
2993 | 1767 | Bruntsfield Links | 55.937123 | -3.206432 | Corner of Bruntsfield Links next to Public Toilets |
3080 | 1768 | Thirlestane Road | 55.935324 | -3.198763 | On corner with St Margaret’s Place |
4906 | 1769 | Brunswick Place | 55.960852 | -3.180986 | Corner of Elm Row/Brunswick Street |
5049 | 1770 | Ellersly Road | 55.945046 | -3.250881 | Corner with A8 Corstorphine Road at Western Terrace |
2022 | 1798 | Chambers Street | 55.947600 | -3.188920 | Outside National Museum |
2004 | 1799 | Murrayfield | 55.944791 | -3.243673 | Near Stadium and Ice Rink |
1936 | 1800 | Joppa | 55.948949 | -3.094727 | East end of Promenade |
4346 | 1807 | Gamekeeper’s Road | 55.969443 | -3.307259 | Junction with Whitehouse Road |
5068 | 1808 | Gorgie Road | 55.938741 | -3.229909 | Corner with McLeod Street |
6158 | 1809 | Royal Edinburgh Hospital | 55.927818 | -3.213308 | Next to Kennedy Tower |
6682 | 1813 | Milton Road – Edinburgh College | 55.944066 | -3.098561 | Milton Road Campus |
6534 | 1814 | Abbeyhill | 55.955248 | -3.172216 | Near Abbey Mount |
7036 | 1815 | Sighthill – Edinburgh College | 55.926684 | -3.289481 | Sighthill Campus |
9335 | 1818 | Dynamic Earth | 55.951089 | -3.175725 | Outside Dynamic Earth, Holyrood Road |
215 | 1819 | Heriot Watt – Edinburgh Business School | 55.908823 | -3.320113 | Outside Edinburgh Business School |
1125 | 1820 | Heriot Watt – Student Accommodation | 55.908413 | -3.328784 | Outside Anna Macleod Halls |
2223 | 1821 | Drumsheugh Place | 55.951594 | -3.212354 | Corner with Drumsheugh Gardens |
2472 | 1822 | Edinburgh Park Station | 55.927383 | -3.307442 | Next to Rail and Tram stations |
2633 | 1823 | Boroughmuir | 55.940071 | -3.215336 | Off Gibson Terrace |
2142 | 1824 | Duke Street | 55.969012 | -3.167395 | Junction with Easter Road |
270 | 1857 | City Chambers Launch Station | 55.950222 | -3.190270 | Temporary station at City Chambers |
12197 | 1859 | Edinburgh Park Central | 55.931169 | -3.314414 | Between Lochside Court and Lochside Avenue |
9598 | 1860 | Ingliston Park & Ride | 55.938792 | -3.355556 | Next to Customer Service building |
6469 | 1864 | Borrowman Square | 55.982606 | -3.381455 | Near to Scotstoun Bus Terminus |
9252 | 1865 | Dalmeny Station | 55.986761 | -3.382427 | Next to bike shelter |
6130 | 1866 | The Loan | 55.989900 | -3.397773 | Next to East Coast Tyres |
8848 | 1868 | Forth Bridge Visitors Centre | 55.987743 | -3.403752 | Off Ferrymuir Gait |
8530 | 1869 | Hopetoun Road | 55.990182 | -3.404604 | Junction with Farquhar Terrace/Boness Road |
9287 | 1870 | Hawes Pier | 55.990530 | -3.385597 | Off Newhalls Road |
11105 | 1871 | Scotstoun House | 55.981107 | -3.394211 | ARUP Edinburgh |
10051 | 1874 | Tesco Ferrymuir | 55.983766 | -3.401352 | Outside supermarket |
10024 | 1877 | Port Edgar Marina | 55.992957 | -3.407156 | Next to Marina Shop and Restaurant |
855 | 2259 | Leith Walk North | 55.967918 | -3.173586 | Next to Allander House |
14980 | 2263 | Musselburgh Lidl | 55.943880 | -3.066754 | Musselborough North High Street opposite Harbour Road |
24262 | 2265 | Musselburgh Brunton Hall | 55.943961 | -3.058307 | Adjacent to the Brunton Theatre |
21035 | 2265 | Musselburgh Brunton Hall | 55.944009 | -3.058493 | Adjacent to the Brunton Theatre |
3184 | 2265 | Musselburgh Brunton Hall | 55.943961 | -3.058307 | Adjacent to the Brunton Theatre |
5806 | 2268 | Picady Place | 55.956535 | -3.186248 | Outside Omni Centre |
5963 | 2268 | Picardy Place | 55.956535 | -3.186248 | Outside Omni Centre |
Before continuing with the analysis, I decided to reconcile the inconsistent station IDs and names across the whole dataset, so that incorrectly entered data would not distort the results.
IN:
edbikes_df.loc[edbikes_df['start_station_name']=='Brunswick Place',['start_station_id']] = 1769
edbikes_df.loc[edbikes_df['end_station_name']=='Brunswick Place',['end_station_id']] = 1769
edbikes_df.loc[edbikes_df['start_station_name']=='Causewayside',['start_station_id']] = 1727
edbikes_df.loc[edbikes_df['end_station_name']=='Causewayside',['end_station_id']] = 1727
edbikes_df.loc[edbikes_df['start_station_name']=='Comely Bank Road',['start_station_id']] = 1763
edbikes_df.loc[edbikes_df['end_station_name']=='Comely Bank Road',['end_station_id']] = 1763
edbikes_df.loc[edbikes_df['start_station_name']=='Corstorphine Road',['start_station_id']] = 1747
edbikes_df.loc[edbikes_df['end_station_name']=='Corstorphine Road',['end_station_id']] = 1747
edbikes_df.loc[edbikes_df['start_station_name']=='Craigleith Road',['start_station_id']] = 1764
edbikes_df.loc[edbikes_df['end_station_name']=='Craigleith Road',['end_station_id']] = 1764
edbikes_df.loc[edbikes_df['start_station_name']=='Cramond Foreshore',['start_station_id']] = 1722
edbikes_df.loc[edbikes_df['end_station_name']=='Cramond Foreshore',['end_station_id']] = 1722
edbikes_df.loc[edbikes_df['start_station_name']=='Crichton Street',['start_station_id']] = 1017
.
.
.
and so on
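The repeated assignments above can also be driven by a single name-to-ID mapping. The following is only a minimal sketch on a hypothetical three-row sample: the `rides` frame, the `canonical_ids` dictionary and the "wrong" IDs 1092 and 1050 are illustrative assumptions, not values from the dataset.

```python
import pandas as pd

# Hypothetical sample; the real table has many more rows and columns.
rides = pd.DataFrame({
    "start_station_name": ["Brunswick Place", "Causewayside", "Picardy Place"],
    "start_station_id": [1092, 1727, 2268],
    "end_station_name": ["Causewayside", "Brunswick Place", "Causewayside"],
    "end_station_id": [1727, 1769, 1050],
})

# Canonical ID per station name (the two entries mirror the assignments above).
canonical_ids = {"Brunswick Place": 1769, "Causewayside": 1727}

for side in ("start", "end"):
    name_col, id_col = f"{side}_station_name", f"{side}_station_id"
    mapped = rides[name_col].map(canonical_ids)
    # Keep the existing ID wherever the name is not in the mapping.
    rides[id_col] = mapped.fillna(rides[id_col]).astype(int)
```

Stations missing from the mapping (here Picardy Place) keep their original ID, so the dictionary only needs to list the names that require a correction.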
The listed stations can be plotted on a map using their coordinates and the folium visualization library.
IN:
import folium
from folium import plugins
m_stations = folium.Map([55.948612, -3.200833], zoom_start=12)  # Creates a map centered on the city center.
for place, row in stations.iterrows():  # Iterates over the rows of the stations table.
    folium.Marker(row[['latitude', 'longitude']].values.tolist(),
                  popup=folium.Popup(f"""Station name: {row['station_name']}"""),
                  icon=folium.Icon(icon="home", prefix='fa')
                  ).add_to(m_stations)
m_stations
OUT:
Identifying active and inactive stations.
A station counts as active if it was used at least once during the last year of records.
IN:
# Lists the IDs of start stations used in the last year:
start_active = (edbikes_df.query("started_at >= '2020-06-30' or ended_at >= '2020-06-30'")[['start_station_id']]
                .drop_duplicates(subset='start_station_id').rename(columns={'start_station_id':'station_id'})
               )
# Lists the IDs of end stations used in the last year:
end_active = (edbikes_df.query("started_at >= '2020-06-30' or ended_at >= '2020-06-30'")[['end_station_id']]
              .drop_duplicates(subset='end_station_id').rename(columns={'end_station_id':'station_id'})
             )
# Builds the set of station IDs used in the last year by combining start_active and end_active:
active_stations = pd.concat([end_active,start_active]).drop_duplicates().sort_values('station_id')
# Helper data frames; joining them yields the station IDs with their names and coordinates:
df_s = stations.reset_index()[['station_id','station_name','latitude','longitude']].set_index('station_id')
df_a = active_stations.set_index('station_id')
# The resulting dataframe of active stations:
active_complete = df_a.join(df_s).dropna()
active_complete = active_complete.reset_index().drop_duplicates(subset = 'station_id')
# Prints the first 6 rows:
active_complete.head(6)
OUT:
 | station_id | station_name | latitude | longitude |
1 | 183 | Waverley Bridge | 55.951344 | -3.191421 |
3 | 189 | City Chambers | 55.950109 | -3.190258 |
4 | 225 | Waverley Court | 55.951734 | -3.184179 |
5 | 246 | Royal Commonwealth Pool | 55.939000 | -3.173924 |
6 | 247 | Charlotte Square | 55.952335 | -3.207101 |
Active stations on the map
IN:
m_active_stations = folium.Map([55.948612, -3.200833], zoom_start=12)
for place, row in active_complete.iterrows():
    folium.Marker(row[['latitude', 'longitude']].values.tolist(),
                  popup=folium.Popup(f"""Station name: {row['station_name']}"""),
                  icon=folium.Icon(icon="home", prefix='fa')
                  ).add_to(m_active_stations)
m_active_stations
OUT:
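Filtering the base dataset so that it contains only records from active stations.
IN:
# Each column is tested separately: 'start_station_id' and 'end_station_id' on its own evaluates to just 'end_station_id'.
edbikes_df = edbikes_df[edbikes_df['start_station_id'].isin(active_stations.values[:,0]) & edbikes_df['end_station_id'].isin(active_stations.values[:,0])]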
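One detail is easy to miss when filtering on two columns: in Python, `a and b` returns `b` whenever `a` is truthy, so an expression such as `'start_station_id' and 'end_station_id'` is just the second column name. The sketch below, with a hypothetical `rides` frame and made-up station IDs, shows this and one way to require that both the start and the end station are active.

```python
import pandas as pd

# `and` between two non-empty strings returns the second string.
column = "start_station_id" and "end_station_id"

# Hypothetical rides and a hypothetical list of active station IDs.
rides = pd.DataFrame({
    "start_station_id": [183, 999, 189],
    "end_station_id": [189, 183, 999],
})
active_ids = [183, 189]

# Test each column separately and combine the boolean masks with &.
both_active = (rides["start_station_id"].isin(active_ids)
               & rides["end_station_id"].isin(active_ids))
filtered = rides[both_active]
```

Only the first row survives, because it is the only ride that both starts and ends at an active station.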
Creating tables with the totals of rentals and returns.
IN:
df = edbikes_df.assign(cnt=0).groupby(['start_station_id','started_at']).count()[['cnt']]  # Creates a table of rentals.
df1 = df.groupby('start_station_id').sum().rename(columns={'cnt':'sum_borrowings'}).rename_axis('station_id')  # Total rentals per station.
df2 = edbikes_df.assign(cnt=0).groupby(['end_station_id','ended_at']).count()[['cnt']]  # Creates a table of returns.
df3 = df2.groupby('end_station_id').sum().rename(columns={'cnt':'sum_returns'}).rename_axis('station_id')  # Total returns per station.
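From the table created above we display descriptive statistics of rentals. Dividing by 1019, the number of days covered by the records, converts the totals to daily values.
IN:
(df1[['sum_borrowings']]/1019).describe()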
OUT:
sum_borrowings
count 161.000000
mean 2.508299 # On average 2.5 rentals per day.
std 2.941329
min 0.000981 # At minimum 1 rental in 1019 days.
25% 0.219823
50% 1.424926 # Median 1.4 per day.
75% 3.866536
max 16.265947 # At most 16 per day.
Descriptive statistics of returns.
IN:
(df3[['sum_returns']]/1019).describe()
OUT:
sum_returns
count 112.000000
mean 3.605680 # On average 3.6 returns per day.
std 3.316777
min 0.005888 # At minimum one return per 170 days.
25% 1.361138
50% 2.635427 # Median 2.64 returns per day.
75% 4.948724
max 16.345437 # At most 16 returns per day.
Listing the busiest stations.
IN:
activity_df = df3.join(df1)  # Creates a table with the totals of both rentals and returns.
activity_df = activity_df.fillna(0)  # Fills missing values with zero.
activity_df['frequency'] = activity_df.sum_returns + activity_df.sum_borrowings  # Adds a column with the frequency per station.
activity_df['day_frequency'] = (activity_df.frequency/1019).astype(int)  # Adds the daily frequency.
(
    stations.join(activity_df, on='station_id')[['station_id','station_name','frequency','day_frequency']].drop_duplicates(subset='station_id').dropna()  # Adds the station names.
    .sort_values('frequency',ascending=False).head(10)  # Sorts by frequency in descending order and shows the first 10.
)
OUT:
Determining where bikes tend to be missing and where they pile up.
First we add the percentage share of returns to the frequency table.
IN:
activity_df = activity_df.assign(returns_percent=(activity_df.sum_returns/activity_df.frequency*100).round(2))
activity_df.head(10)
OUT:
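As a worked illustration of the returns share, the sketch below repeats the same arithmetic on two hypothetical stations (the totals 120/80 and 45/105 are made up). A value above 50 means more bikes arrive than leave; a value below 50 means the opposite.

```python
def returns_percent(sum_returns, sum_borrowings):
    """Share of returns in all movements at a station, in percent."""
    frequency = sum_returns + sum_borrowings
    return round(sum_returns / frequency * 100, 2)

# Hypothetical station totals (returns, rentals).
busy_dropoff = returns_percent(120, 80)   # more returns than rentals
busy_pickup = returns_percent(45, 105)    # more rentals than returns
```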
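A function that we apply to tell where bikes are missing and where they accumulate.
IN:
def usage(x):
    if x > 50:
        return "rather accumulates"
    elif x < 50:
        return "rather missing"
    else:
        return "balanced"  # Exactly 50 % returns; without this branch the function would return None.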
The top ten stations where bikes accumulate.
IN:
activity_df.assign(use=activity_df['returns_percent'].apply(usage)).sort_values('returns_percent',ascending=False).head(10)
OUT:
The top ten stations where bikes are missing.
IN:
activity_df.assign(use=activity_df['returns_percent'].apply(usage)).sort_values('returns_percent').head(10)
OUT:
DISPLAYING DISTANCES BETWEEN STATIONS.
IN:
!pip install geopy
import numpy as np
from geopy.distance import geodesic
from itertools import combinations
combs = list(combinations(stations.index,2))  # Builds a list of all possible pairs of stations.
coords = np.array(list(combinations(stations[['latitude', 'longitude']].values, 2)))  # Builds a list of all possible pairs of coordinates.
coords = coords.reshape(coords.shape[0],4)  # Reshapes to 4 columns with one row per pair.
coords = pd.DataFrame(coords)  # Converts the coords array to a dataframe.
# Function for computing the distances:
def geodesic_vec(a, b, c, d):
    rs = geodesic( (a, b), (c, d) ).kilometers
    return rs
distances = list()
for data,row in coords.iterrows():
    dists = np.round(geodesic_vec(row[0],row[1],row[2],row[3]),2)
    distances.append(dists)  # List of distances.
distances_df = pd.DataFrame(distances,pd.Index(combs,names=['City1','City2']),columns=['distance'])  # Distance table.
distances_df
OUT:
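A quick way to sanity-check individual values in such a distance table is the haversine formula, which treats the Earth as a sphere. This is a swapped-in approximation, not the ellipsoidal geodesic that geopy computes, so the two results should be close but not identical. The helper name `haversine_km` is hypothetical; the coordinates are those of Charlotte Square and St Andrew Square from the ride data.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points given in decimal degrees."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlambda = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

# Charlotte Square -> St Andrew Square
d = haversine_km(55.952335, -3.207101, 55.954728, -3.192653)
```

The result is a little under one kilometre, which is plausible for two squares at opposite ends of George Street.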
RENTAL DURATION
IN:
duration_df = edbikes_df[['duration']].assign(duration_min=round(edbikes_df.duration/60))
# Creates a dataset with the rental duration in minutes.
duration_df[['duration_min']].describe()  # Descriptive statistics of rental duration.
OUT:
Identifying outliers before plotting the histogram of rental durations:
1) Compute the interquartile range (IQR) and multiply it by 1.5: (42 - 10) * 1.5 = 48
2) Add the result to the 75th percentile: 48 + 42 = 90
3) All values above 90 are outliers.
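The three steps above can be sketched as a small helper. The function name `upper_outlier_fence` is hypothetical; the quartiles 10 and 42 are the ones used in the calculation above.

```python
def upper_outlier_fence(q1, q3, k=1.5):
    """Upper fence for outliers: Q3 + k * IQR, where IQR = Q3 - Q1."""
    iqr = q3 - q1
    return q3 + k * iqr

fence = upper_outlier_fence(q1=10, q3=42)  # 42 + 1.5 * 32
```

With real data the quartiles would come from `describe()` or `quantile()` rather than being typed in by hand.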
IN:
import matplotlib.pyplot as plt
plt.hist(duration_df[duration_df['duration_min'].isin(range(5,91))][['duration_min']],bins = 6)
plt.xlabel('minutes')
plt.ylabel('count')
plt.title('DURATION IN MINUTES')
plt.grid(True)
plt.show()
OUT:
NUMBER OF RENTALS PER DAY
IN:
df4 = edbikes_df[['started_at']]  # Selects the column with rental start times.
df4 = df4.assign(started_at_day = pd.to_datetime(df4['started_at']).dt.date)  # Adds a column in date format.
df4 = df4[['started_at_day']].assign(borrowings_count=0).groupby('started_at_day').count()  # Adds the number of rentals per day.
df4.index = pd.to_datetime(df4.index)  # Changes the index format.
df4
OUT:
DEMAND OVER TIME
IN:
hour_df = edbikes_df.assign(started_at_day = pd.to_datetime(edbikes_df['started_at']).dt.date, hours = pd.to_datetime(edbikes_df['started_at']).dt.hour)  # Adds columns with the day and the hour.
day_df = hour_df[hour_df['hours'].isin(range(6,19))]  # Keeps daytime only (hours 6 to 18).
day_df = day_df.groupby('started_at_day').count()[['hours']]  # Counts the rentals during the daytime.
night_df = hour_df[~hour_df['hours'].isin(range(6,19))]  # Keeps nighttime only.
night_df = night_df.groupby('started_at_day').count()[['hours']]  # Counts the rentals during the night.
fig,ax = plt.subplots(figsize=(19,4))  # Creates an empty chart.
ax.plot(day_df)  # Adds the daytime rental counts.
ax.plot(night_df)  # Adds the nighttime rental counts.
ax.legend(labels = ['day','night'])
ax.grid()
plt.show()
OUT:
DO PEOPLE RENT BIKES MORE AT THE WEEKEND OR DURING THE WEEK?
IN:
weekday_df = (edbikes_df.assign(started_at_day = pd.to_datetime(edbikes_df['started_at'])
              .dt.date, day_of_week = pd.to_datetime(edbikes_df['started_at']).dt.dayofweek)  # Adds the day-of-week number.
             )
week_df = weekday_df[weekday_df['day_of_week'].isin(range(0,5))]  # Keeps working days only.
week_df = week_df.groupby('started_at_day').count()[['started_at']]  # Counts the rentals on working days.
weekend_df = weekday_df[~weekday_df['day_of_week'].isin(range(0,5))]  # Keeps weekends only.
weekend_df = weekend_df.groupby('started_at_day').count()[['started_at']]  # Counts the rentals at weekends.
# Prints the results:
print('Rental daily average working days: ')
print('\n',week_df.describe().iloc[1,0])
print('\n','Rental daily average weekends: ')
print('\n',weekend_df.describe().iloc[1,0])
print('\n','Daily borrowings during weekends are slightly higher than during working days.')
OUT:
Rental daily average working days: 387
Rental daily average weekends: 443
Daily borrowings during weekends are slightly higher than during working days.
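The working-day versus weekend split relies on the day-of-week numbering in which Monday is 0 and Sunday is 6, so values 0 to 4 are working days. The standard library's `date.weekday()` uses the same numbering, which makes the logic easy to sketch without pandas. The daily counts below are hypothetical and serve only to illustrate the grouping.

```python
from collections import defaultdict
from datetime import date

# Hypothetical daily rental counts.
daily_counts = {
    date(2021, 6, 4): 380,  # Friday
    date(2021, 6, 5): 450,  # Saturday
    date(2021, 6, 6): 430,  # Sunday
    date(2021, 6, 7): 390,  # Monday
}

groups = defaultdict(list)
for day, count in daily_counts.items():
    kind = "working" if day.weekday() < 5 else "weekend"  # Monday=0 ... Sunday=6
    groups[kind].append(count)

# Average number of rentals per day in each group.
averages = {kind: sum(values) / len(values) for kind, values in groups.items()}
```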
LOADING THE EDINBURGH WEATHER DATA
IN:
edweather_df = pd.read_sql('edinburgh_weather',conn_string)
edweather_df['date'] = pd.to_datetime(edweather_df['date'])  # Converts the date format.
edweather_df = edweather_df.set_index('date')  # Moves the date into the index.
edweather_df
OUT:
The weather table has no missing values.
IN:
edweather_df.isna().sum()
OUT:
Before processing the weather data further, we convert all values to numeric.
IN:
# Function assigning a visibility grade:
def vis(x):
    if x == 'Excellent':
        return 4
    elif x == 'Good':
        return 3
    elif x == 'Average':
        return 2
    else:
        return 1
# Conversion to numeric values (expand=False makes str.extract return a Series):
edweather_df = (edweather_df.assign(temp_celsius = edweather_df.temp.str.extract(r"([-+]?\d+)", expand=False).astype(int),
                                    feels_celsius = edweather_df.feels.str.extract(r"([-+]?\d+)", expand=False).astype(int),
                                    wind_kmh = edweather_df.wind.str.extract(r"([-+]?\d+)", expand=False).astype(int),
                                    gust_kmh = edweather_df.gust.str.extract(r"([-+]?\d+)", expand=False).astype(int),
                                    rain_mm = edweather_df.rain.str.extract(r"([-+]?\d*\.\d+|[-+]?\d+)", expand=False).astype(float),
                                    humidity_percent = edweather_df.humidity.str.extract(r"([-+]?\d+)", expand=False).astype(int),
                                    cloud_percent = edweather_df.cloud.str.extract(r"([-+]?\d+)", expand=False).astype(int),
                                    pressure_mb = edweather_df.pressure.str.extract(r"([-+]?\d+)", expand=False).astype(int),
                                    vis_point = edweather_df.vis.apply(vis))
               )[['time','temp_celsius', 'feels_celsius', 'wind_kmh', 'gust_kmh', 'rain_mm', 'humidity_percent', 'cloud_percent', 'pressure_mb', 'vis_point']]
edweather_df
OUT:
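To see what the two regular expressions used in the conversion pick out, here is a small sketch with the standard `re` module. The raw strings are hypothetical, since the exact format of the source values is an assumption, and the helper names `first_int` and `first_float` are illustrative.

```python
import re

INT_PATTERN = re.compile(r"[-+]?\d+")                    # optional sign, then digits
FLOAT_PATTERN = re.compile(r"[-+]?\d*\.\d+|[-+]?\d+")    # decimal number, else integer

def first_int(text):
    """First integer found in the text."""
    return int(INT_PATTERN.search(text).group())

def first_float(text):
    """First decimal or integer number found in the text."""
    return float(FLOAT_PATTERN.search(text).group())

# Hypothetical raw values.
temp = first_int("-2 °c")
wind = first_int("14 km/h from SSW")
rain = first_float("0.4 mm")
truncated = first_int("0.4 mm")  # the integer pattern stops at the decimal point
```

The last line shows why rainfall needs the decimal-aware pattern: the integer pattern alone would read "0.4 mm" as 0.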
Descriptive statistics of the individual weather measurements. Results:
IN:
edweather_df.describe()
OUT:
Aggregating the measurements to daily means.
IN:
edweather_df = edweather_df.groupby('date').mean(numeric_only=True)  # numeric_only skips the text column 'time'.
edweather_df
OUT:
To test the correlation between the number of rentals and the weather, we add the rental counts to each day.
IN:
edweather_df = df4.join(edweather_df,how='inner')
edweather_df
OUT:
Correlation coefficients. Does the weather affect the number of rentals?
IN:
print("Correlation between temperature and rentals: ")
print('\n',edweather_df['borrowings_count'].corr(edweather_df['feels_celsius']).round(2))
OUT:
Correlation between temperature and rentals: 0.43
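For readers who want to see what the correlation coefficient measures, the sketch below computes Pearson's r by hand: the sum of the products of deviations from the means, divided by the square root of the product of the summed squared deviations. pandas' `corr` uses Pearson's method by default. The two short lists are hypothetical and unrelated to the real dataset.

```python
from math import sqrt

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical daily values: felt temperature and number of rentals.
feels = [5, 8, 12, 15, 18]
rentals = [200, 260, 310, 400, 420]
r = pearson_r(feels, rentals)
```

Values near 1 or -1 indicate a strong linear relationship, values near 0 a weak one; the made-up data here is almost perfectly linear, unlike the real measurements.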
The effect of temperature is smaller during the main season.
IN:
start_date = '2020-06-01'
end_date = '2020-08-30'
selection = (edweather_df.index>=start_date)&(edweather_df.index<=end_date)
season_df = edweather_df.loc[selection]
print("Correlation between temperature and rentals from June to end August: ")
print('\n',season_df['borrowings_count'].corr(season_df['feels_celsius']).round(2))
OUT:
Correlation between temperature and rentals from June to end August: 0.36
Other weather effects.
IN:
print("Correlation between wind speed and rentals: ")
print('\n',edweather_df['borrowings_count'].corr(edweather_df['wind_kmh']).round(2))
print("Correlation between gusty wind and rentals: ")
print('\n',edweather_df['borrowings_count'].corr(edweather_df['gust_kmh']).round(2))
print("Correlation between rain fall and rentals: ")
print('\n',edweather_df['borrowings_count'].corr(edweather_df['rain_mm']).round(2))
print("Correlation between humidity and rentals: ")
print('\n',edweather_df['borrowings_count'].corr(edweather_df['humidity_percent']).round(2))
print("Correlation between clouds and rentals: ")
print('\n',edweather_df['borrowings_count'].corr(edweather_df['cloud_percent']).round(2))
print("Correlation between atmospheric pressure and rentals: ")
print('\n',edweather_df['borrowings_count'].corr(edweather_df['pressure_mb']).round(2))
print("Correlation between visibility factor and rentals: ")
print('\n',edweather_df['borrowings_count'].corr(edweather_df['vis_point']).round(2))
OUT:
Correlation between wind speed and rentals: -0.2
Correlation between gusty wind and rentals: -0.24
Correlation between rain fall and rentals: -0.06
Correlation between humidity and rentals: -0.21
Correlation between clouds and rentals: -0.06
Correlation between atmospheric pressure and rentals: 0.11
Correlation between visibility factor and rentals: 0.06
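The individual correlations can also be computed in one step with `DataFrame.corrwith`, which correlates every column of a frame with a given series. This is only a sketch on a hypothetical five-row table; the `daily` frame reuses a few column names from the analysis, but its values are made up.

```python
import pandas as pd

# Hypothetical daily table with a few of the columns used in the analysis.
daily = pd.DataFrame({
    "borrowings_count": [200, 260, 310, 400, 420],
    "feels_celsius": [5, 8, 12, 15, 18],
    "wind_kmh": [30, 22, 25, 12, 10],
    "rain_mm": [1.2, 0.0, 0.4, 0.0, 0.1],
})

# Correlate every weather column with the rental counts.
weather_cols = daily.columns.drop("borrowings_count")
correlations = daily[weather_cols].corrwith(daily["borrowings_count"]).round(2)
```

The result is a Series indexed by weather variable, which can be sorted or plotted directly instead of printing each coefficient separately.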