.. _quickstart:

Quickstart
==========

This guide will walk you through the basic usage of pytube.

Let's get started with some examples.

Downloading a Video
-------------------

Downloading a video from YouTube with pytube is incredibly easy.

Begin by importing the YouTube class::

    >>> from pytube import YouTube

Now, let's try to download a video. For this example, let's take something
popular like PSY - Gangnam Style::

    >>> yt = YouTube('https://www.youtube.com/watch?v=9bZkp7q19f0')

Now, we have a :class:`YouTube <pytube.YouTube>` object called ``yt``.

The pytube API makes all information intuitive to access. For example, this is
how you would get the video's title::

    >>> yt.title
    'PSY - GANGNAM STYLE(강남스타일) M/V'

And this is how you would get the thumbnail URL::

    >>> yt.thumbnail_url
    'https://i.ytimg.com/vi/9bZkp7q19f0/default.jpg'

Neat, right? Next let's see the available media formats::

    >>> yt.streams.all()
    [<Stream: itag="22" mime_type="video/mp4" res="720p" fps="30fps" vcodec="avc1.64001F" acodec="mp4a.40.2">,
    <Stream: itag="43" mime_type="video/webm" res="360p" fps="30fps" vcodec="vp8.0" acodec="vorbis">,
    <Stream: itag="18" mime_type="video/mp4" res="360p" fps="30fps" vcodec="avc1.42001E" acodec="mp4a.40.2">,
    <Stream: itag="36" mime_type="video/3gpp" res="240p" fps="30fps" vcodec="mp4v.20.3" acodec="mp4a.40.2">,
    <Stream: itag="17" mime_type="video/3gpp" res="144p" fps="30fps" vcodec="mp4v.20.3" acodec="mp4a.40.2">,
    <Stream: itag="137" mime_type="video/mp4" res="1080p" fps="30fps" vcodec="avc1.640028">,
    <Stream: itag="136" mime_type="video/mp4" res="720p" fps="30fps" vcodec="avc1.4d401f">,
    <Stream: itag="135" mime_type="video/mp4" res="480p" fps="30fps" vcodec="avc1.4d401f">,
    <Stream: itag="134" mime_type="video/mp4" res="360p" fps="30fps" vcodec="avc1.4d401e">,
    <Stream: itag="133" mime_type="video/mp4" res="240p" fps="30fps" vcodec="avc1.4d4015">,
    <Stream: itag="160" mime_type="video/mp4" res="144p" fps="30fps" vcodec="avc1.4d400c">,
    <Stream: itag="140" mime_type="audio/mp4" abr="128kbps" acodec="mp4a.40.2">,
    <Stream: itag="171" mime_type="audio/webm" abr="128kbps" acodec="vorbis">]

Let's say we want to get the first stream::

    >>> stream = yt.streams.first()
    >>> stream
    <Stream: itag="22" mime_type="video/mp4" res="720p" fps="30fps" vcodec="avc1.64001F" acodec="mp4a.40.2">

And to download it to the current working directory::

    >>> stream.download()

You can also specify a destination path::

    >>> stream.download('/tmp')
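
If the destination directory might not exist yet, it can be created before
downloading. The helper below is a small sketch of that idea; ``download_to``
is a hypothetical name, not part of pytube, and it only assumes that the object
it receives has a ``download(path)`` method like the ``stream`` used above.

```python
from pathlib import Path


def download_to(stream, directory):
    """Create ``directory`` if needed, then download ``stream`` into it.

    ``stream`` is any object with a ``download(path)`` method, such as the
    one returned by ``yt.streams.first()``.
    """
    target = Path(directory).expanduser()
    # parents=True creates missing parent folders; exist_ok=True makes the
    # call harmless when the directory is already there.
    target.mkdir(parents=True, exist_ok=True)
    return stream.download(str(target))
```

Calling ``download_to(stream, '~/Videos/pytube')`` would then expand ``~``,
create the folder if needed, and hand the resulting path to ``stream.download``.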


Working with Streams
====================

The next section will explore the various options available for working with media
streams, but before we can dive in, we need to review a new-ish streaming technique
adopted by YouTube.

DASH vs Progressive Streams
---------------------------

Begin by running the following::

    >>> yt.streams.all()
    [<Stream: itag="22" mime_type="video/mp4" res="720p" fps="30fps" vcodec="avc1.64001F" acodec="mp4a.40.2">,
    <Stream: itag="43" mime_type="video/webm" res="360p" fps="30fps" vcodec="vp8.0" acodec="vorbis">,
    <Stream: itag="18" mime_type="video/mp4" res="360p" fps="30fps" vcodec="avc1.42001E" acodec="mp4a.40.2">,
    <Stream: itag="36" mime_type="video/3gpp" res="240p" fps="30fps" vcodec="mp4v.20.3" acodec="mp4a.40.2">,
    <Stream: itag="17" mime_type="video/3gpp" res="144p" fps="30fps" vcodec="mp4v.20.3" acodec="mp4a.40.2">,
    <Stream: itag="137" mime_type="video/mp4" res="1080p" fps="30fps" vcodec="avc1.640028">,
    <Stream: itag="136" mime_type="video/mp4" res="720p" fps="30fps" vcodec="avc1.4d401f">,
    <Stream: itag="135" mime_type="video/mp4" res="480p" fps="30fps" vcodec="avc1.4d401f">,
    <Stream: itag="134" mime_type="video/mp4" res="360p" fps="30fps" vcodec="avc1.4d401e">,
    <Stream: itag="133" mime_type="video/mp4" res="240p" fps="30fps" vcodec="avc1.4d4015">,
    <Stream: itag="160" mime_type="video/mp4" res="144p" fps="30fps" vcodec="avc1.4d400c">,
    <Stream: itag="140" mime_type="audio/mp4" abr="128kbps" acodec="mp4a.40.2">,
    <Stream: itag="171" mime_type="audio/webm" abr="128kbps" acodec="vorbis">]


You may notice that some streams listed have both a video codec and an audio
codec, while others have just video or just audio. This is a result of YouTube
supporting a streaming technique called Dynamic Adaptive Streaming over HTTP
(DASH).

In the context of pytube, this matters for the highest quality streams: you
now need to download the audio and video tracks separately and then
post-process them with software like FFmpeg to merge them.
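
The merge step itself happens outside pytube. As a rough sketch, assuming
FFmpeg is installed and that the two tracks were already saved as
``video.mp4`` and ``audio.mp4`` (placeholder file names), the command could be
assembled like this. ``build_merge_command`` is a hypothetical helper, not a
pytube function.

```python
import shlex


def build_merge_command(video_path, audio_path, output_path):
    """Return the FFmpeg argument list that muxes one video and one audio file."""
    return [
        "ffmpeg",
        "-i", video_path,  # first input: the video-only DASH stream
        "-i", audio_path,  # second input: the audio-only DASH stream
        "-c", "copy",      # copy both tracks as-is instead of re-encoding
        output_path,
    ]


cmd = build_merge_command("video.mp4", "audio.mp4", "merged.mp4")
print(shlex.join(cmd))
# prints: ffmpeg -i video.mp4 -i audio.mp4 -c copy merged.mp4

# To actually run it (requires FFmpeg on the PATH):
# import subprocess
# subprocess.run(cmd, check=True)
```

Copying the tracks avoids a slow re-encode, but it only works when the output
container supports both codecs, so pairing an MP4 video stream with an MP4
audio stream is the safer choice.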

The legacy streams that contain the audio and video in a single file (referred
to as "progressive download") are still available, but only for resolutions
720p and below.

To view only these progressive download streams::

    >>> yt.streams.filter(progressive=True).all()
    [<Stream: itag="22" mime_type="video/mp4" res="720p" fps="30fps" vcodec="avc1.64001F" acodec="mp4a.40.2">,
    <Stream: itag="43" mime_type="video/webm" res="360p" fps="30fps" vcodec="vp8.0" acodec="vorbis">,
    <Stream: itag="18" mime_type="video/mp4" res="360p" fps="30fps" vcodec="avc1.42001E" acodec="mp4a.40.2">,
    <Stream: itag="36" mime_type="video/3gpp" res="240p" fps="30fps" vcodec="mp4v.20.3" acodec="mp4a.40.2">,
    <Stream: itag="17" mime_type="video/3gpp" res="144p" fps="30fps" vcodec="mp4v.20.3" acodec="mp4a.40.2">]

Conversely, if you only want to see the DASH streams (also referred to as
"adaptive") you can do::

    >>> yt.streams.filter(adaptive=True).all()
    [<Stream: itag="137" mime_type="video/mp4" res="1080p" fps="30fps" vcodec="avc1.640028">,
    <Stream: itag="136" mime_type="video/mp4" res="720p" fps="30fps" vcodec="avc1.4d401f">,
    <Stream: itag="135" mime_type="video/mp4" res="480p" fps="30fps" vcodec="avc1.4d401f">,
    <Stream: itag="134" mime_type="video/mp4" res="360p" fps="30fps" vcodec="avc1.4d401e">,
    <Stream: itag="133" mime_type="video/mp4" res="240p" fps="30fps" vcodec="avc1.4d4015">,
    <Stream: itag="160" mime_type="video/mp4" res="144p" fps="30fps" vcodec="avc1.4d400c">,
    <Stream: itag="140" mime_type="audio/mp4" abr="128kbps" acodec="mp4a.40.2">,
    <Stream: itag="171" mime_type="audio/webm" abr="128kbps" acodec="vorbis">]
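
One way to picture the difference: a progressive stream carries both a video
codec and an audio codec, while an adaptive stream carries only one of the
two. The sketch below applies that rule to a few plain dictionaries copied
from the listings above. The ``is_progressive`` and ``is_adaptive`` helpers
are illustrative names and are not meant to describe pytube's internals.

```python
# Three streams from the listings above, reduced to plain dictionaries.
streams = [
    {"itag": 22, "mime_type": "video/mp4", "vcodec": "avc1.64001F", "acodec": "mp4a.40.2"},
    {"itag": 137, "mime_type": "video/mp4", "vcodec": "avc1.640028", "acodec": None},
    {"itag": 140, "mime_type": "audio/mp4", "vcodec": None, "acodec": "mp4a.40.2"},
]


def is_progressive(stream):
    """A progressive stream has audio and video in a single file."""
    return stream["vcodec"] is not None and stream["acodec"] is not None


def is_adaptive(stream):
    """An adaptive (DASH) stream has only one of the two tracks."""
    return not is_progressive(stream)


progressive = [s["itag"] for s in streams if is_progressive(s)]
adaptive = [s["itag"] for s in streams if is_adaptive(s)]
print(progressive)  # [22]
print(adaptive)     # [137, 140]
```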

Pytube allows you to filter on every property available (see
:py:meth:`pytube.StreamQuery.filter` for a complete list of filter options).
Let's take a look at some common examples.

Query Audio-Only Streams
------------------------

To query the streams that contain only the audio track::

    >>> yt.streams.filter(only_audio=True).all()
    [<Stream: itag="140" mime_type="audio/mp4" abr="128kbps" acodec="mp4a.40.2">,
    <Stream: itag="171" mime_type="audio/webm" abr="128kbps" acodec="vorbis">]

Query MPEG-4 Streams
--------------------

To query only streams in the MPEG-4 format::

    >>> yt.streams.filter(file_extension='mp4').all()
    [<Stream: itag="22" mime_type="video/mp4" res="720p" fps="30fps" vcodec="avc1.64001F" acodec="mp4a.40.2">,
    <Stream: itag="18" mime_type="video/mp4" res="360p" fps="30fps" vcodec="avc1.42001E" acodec="mp4a.40.2">,
    <Stream: itag="137" mime_type="video/mp4" res="1080p" fps="30fps" vcodec="avc1.640028">,
    <Stream: itag="136" mime_type="video/mp4" res="720p" fps="30fps" vcodec="avc1.4d401f">,
    <Stream: itag="135" mime_type="video/mp4" res="480p" fps="30fps" vcodec="avc1.4d401f">,
    <Stream: itag="134" mime_type="video/mp4" res="360p" fps="30fps" vcodec="avc1.4d401e">,
    <Stream: itag="133" mime_type="video/mp4" res="240p" fps="30fps" vcodec="avc1.4d4015">,
    <Stream: itag="160" mime_type="video/mp4" res="144p" fps="30fps" vcodec="avc1.4d400c">,
    <Stream: itag="140" mime_type="audio/mp4" abr="128kbps" acodec="mp4a.40.2">]
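
Once the list is narrowed down, you often want a single "best" stream from
it. The sketch below shows one way to do that on plain dictionaries modelled
on the listing above. The ``ext`` and ``progressive`` keys and the
``resolution_height`` helper are assumptions made for this example. Note that
resolution labels have to be compared as numbers, because ``"720p"`` sorts
after ``"1080p"`` as a string.

```python
def resolution_height(stream):
    """Turn a label such as "720p" into the integer 720; audio-only gives 0."""
    res = stream.get("res")
    return int(res.rstrip("p")) if res else 0


streams = [
    {"itag": 22, "ext": "mp4", "res": "720p", "progressive": True},
    {"itag": 43, "ext": "webm", "res": "360p", "progressive": True},
    {"itag": 18, "ext": "mp4", "res": "360p", "progressive": True},
    {"itag": 137, "ext": "mp4", "res": "1080p", "progressive": False},
    {"itag": 140, "ext": "mp4", "res": None, "progressive": False},
]

# Keep only MPEG-4 streams that carry audio and video in one file,
# then pick the one with the largest numeric resolution.
candidates = [s for s in streams if s["ext"] == "mp4" and s["progressive"]]
best = max(candidates, key=resolution_height)
print(best["itag"])  # 22
```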

Get Streams by itag
-------------------

To get a stream by a specific itag::

    >>> yt.streams.get_by_itag('22')
    <Stream: itag="22" mime_type="video/mp4" res="720p" fps="30fps" vcodec="avc1.64001F" acodec="mp4a.40.2">

Subtitle/Caption Tracks
=======================

Pytube exposes the caption tracks in much the same way as querying the media
streams. Let's begin by switching to a video that contains them::

    >>> yt = YouTube('https://youtube.com/watch?v=XJGiS83eQLk')
    >>> yt.captions.all()
    [<Caption lang="Arabic" code="ar">,
    <Caption lang="English (auto-generated)" code="en">,
    <Caption lang="English" code="en">,
    <Caption lang="English (United Kingdom)" code="en-GB">,
    <Caption lang="German" code="de">,
    <Caption lang="Greek" code="el">,
    <Caption lang="Indonesian" code="id">,
    <Caption lang="Sinhala" code="si">,
    <Caption lang="Spanish" code="es">,
    <Caption lang="Turkish" code="tr">]

Now let's check out the English captions::

    >>> caption = yt.captions.get_by_language_code('en')

Great, now let's see how YouTube formats them::

    >>> caption.xml_captions
    '<?xml version="1.0" encoding="utf-8" ?><transcript><text start="0" dur="5.541">well i&amp;#39...'

This isn't very easy to work with, so let's convert them to the SRT format::

    >>> print(caption.generate_srt_captions())
    1
    000:000:00,000 --> 000:000:05,541
    well i'm just an editor and i dont know what to type

    2
    000:000:05,541 --> 000:000:12,321
    not new to video. In fact, most films before 1930 were silent and used captions with video

    ...
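
To get a feel for what such a conversion involves, here is a stand-alone
sketch that turns a transcript shaped like the XML shown above into SRT text
using only the standard library. It is not pytube's own
``generate_srt_captions`` implementation. The element and attribute names
(``text``, ``start``, ``dur``) are taken from the sample output, and
``srt_timestamp`` and ``xml_to_srt`` are hypothetical helper names.

```python
import html
import xml.etree.ElementTree as ET


def srt_timestamp(seconds):
    """Format a number of seconds as HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def xml_to_srt(xml_text):
    """Convert <text start=... dur=...> elements into numbered SRT blocks."""
    root = ET.fromstring(xml_text)
    blocks = []
    for index, node in enumerate(root.iter("text"), start=1):
        start = float(node.attrib["start"])
        end = start + float(node.attrib.get("dur", 0))
        # The caption text is escaped twice (e.g. "&amp;#39;"), so unescape
        # once more after the XML parser has done its own pass.
        text = html.unescape(node.text or "")
        blocks.append(
            f"{index}\n{srt_timestamp(start)} --> {srt_timestamp(end)}\n{text}\n"
        )
    return "\n".join(blocks)


sample = (
    "<transcript>"
    '<text start="0" dur="5.541">well i&amp;#39;m here</text>'
    '<text start="5.541" dur="6.78">second line</text>'
    "</transcript>"
)
print(xml_to_srt(sample))
```

Each SRT block is a running number, a ``start --> end`` line, and the caption
text, with a blank line separating consecutive blocks.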