final-draft

2021-05-24 16:16:42 +12:00
parent 456315a6f5
commit f133567e76
2 changed files with 69 additions and 48 deletions
--- a/dataframe.png
+++ b/dataframe.png
--- a/index.html
+++ b/index.html
@@ -115,6 +115,7 @@
 					- Access API
 					- Channel upload playlist
 					- Video statistics
 					- `pandas` dataframe
 				</section>
 				<section data-markdown>
 					### 4. Get YouTube video statistics
@@ -132,8 +133,9 @@
 					```
 				</section>
-				<section>
+				<section data-markdown>
-					<pre><code data-line-numbers="3|5-16|17-18|18-29|30-32"># tubestates/youtube_api.py
+					```python [|3|5-16|17-18|20-29|30-32]
 					# tubestats/youtube_api.py
 					upload_playlist_ID = channel_data['upload_playlist_ID']
@@ -165,7 +167,10 @@ while True:
 					df = pd.json_normalize(video_response, 'items')
 					return df
-					</code></pre>
+				</section>
 				<section data-markdown>
 					### Video statistics
 					![](dataframe.png)
 				</section>
 				<section data-markdown>
 					## How does TubeStats work?
@@ -202,7 +207,7 @@ return df
 				</section>	
 				<section data-markdown>
 					## 6. Testing
-					```python [|16-20]
+					```python [|15-20]
 					# tests/tests_youtube_api.py
 					from tubestats.youtube_api import create_api, YouTubeAPI
 					from tests.test_settings import set_channel_ID_test_case
@@ -344,8 +349,13 @@ return df
 				</section>
 				<section data-markdown>
 					## Some things I would like to discuss
 					- DataFrame and memory
 					- Error handling
 					- Async?
 				</section>
 				<section data-markdown>
 					### DataFrame immutability and memory?
 					```python []
 					df = self.df
 					df = df[['snippet.publishedAt',
 					    'snippet.title',
@@ -355,16 +365,27 @@ return df
 					df = df.fillna(0)
 					# changing dtypes
 					df = df.astype({'statistics.viewCount': 'int',
 						...
 					    'statistics.commentCount': 'int',})
 					# applying the natural log to view counts because the data is heavy-tailed
 					df['statistics.viewCount_NLOG'] = df['statistics.viewCount'].apply(lambda x : np.log(x))
 					df = df.sort_values(by='snippet.publishedAt_REFORMATED', ascending=True) 
 					return df
 					```
 				</section>
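 				<section data-markdown>
 					### DataFrame and memory: a sketch
 					A minimal sketch of what the reassignments on the previous slide do, not the TubeStats implementation. `raw` is a hypothetical stand-in for `self.df`, the values are made up, and `np.log1p` is swapped in for `np.log` as an assumption so that zero counts left by `fillna(0)` stay finite.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for self.df; the values are made up for illustration.
raw = pd.DataFrame({
    'snippet.title': ['a', 'b', 'c'],
    'statistics.viewCount': [10.0, np.nan, 1000.0],
})

df = raw                            # second name for the same object, no copy
same_object = df is raw

df = df[['statistics.viewCount']]   # list selection returns a new DataFrame
df = df.fillna(0)                   # returns a new DataFrame, raw keeps its NaN
df = df.astype({'statistics.viewCount': 'int'})

# Assumption: log1p instead of log, so a view count of 0 maps to 0.0, not -inf.
df['statistics.viewCount_NLOG'] = np.log1p(df['statistics.viewCount'])

# Per-object footprint in bytes, useful when comparing the two frames.
raw_bytes = int(raw.memory_usage(deep=True).sum())
df_bytes = int(df.memory_usage(deep=True).sum())
```

 					`df = self.df` only adds a second name for the same object. The column selection is the first step that returns a new DataFrame, so the stored frame is left unchanged by the later steps.
 				</section>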
 				<section data-markdown>
 					## What did I learn?
 					- Project-based learning
 					- 'Minimum viable product'
 				</section>
 				<section data-markdown>
 					## Conclusion
 					- Analysing consistency
 					- YouTube Data API --> Heroku
 					- Share your work!
 				</section>
 				<section data-markdown>
 					## Acknowledgements
 					- Menno
 			</div>
 		</div>