
YouTube video download (Android/Java)

  • Ashvin solanki  · Tech community  · 6 years ago

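    Note: the original question has been deleted.

    The original question was https://stackoverflow.com/questions/15240011/get-the-download-url-for-youtube-video-android-java/15240012#15240012

    The answers there are outdated and no longer work, so I am posting a new question and answering it myself.

    Old code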

    new YouTubePageStreamUriGetter().execute("https://www.youtube.com/watch?v=4GuqB1BQVr4");
    
    class Meta {
        public String num;
        public String type;
        public String ext;
    
        Meta(String num, String ext, String type) {
            this.num = num;
            this.ext = ext;
            this.type = type;
        }
    }
    
    class Video {
        public String ext = "";
        public String type = "";
        public String url = "";
    
        Video(String ext, String type, String url) {
            this.ext = ext;
            this.type = type;
            this.url = url;
        }
    }
    
    public ArrayList<Video> getStreamingUrisFromYouTubePage(String ytUrl)
            throws IOException {
        if (ytUrl == null) {
            return null;
        }
    
        // Remove any query params in query string after the watch?v=<vid> in
        // e.g.
        // http://www.youtube.com/watch?v=0RUPACpf8Vs&feature=youtube_gdata_player
        int andIdx = ytUrl.indexOf('&');
        if (andIdx >= 0) {
            ytUrl = ytUrl.substring(0, andIdx);
        }
    
        // Get the HTML response
        String userAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0.1)";
        HttpClient client = new DefaultHttpClient();
        client.getParams().setParameter(CoreProtocolPNames.USER_AGENT,
                userAgent);
        HttpGet request = new HttpGet(ytUrl);
        HttpResponse response = client.execute(request);
        String html = "";
        InputStream in = response.getEntity().getContent();
        BufferedReader reader = new BufferedReader(new InputStreamReader(in));
        StringBuilder str = new StringBuilder();
        String line = null;
        while ((line = reader.readLine()) != null) {
            str.append(line.replace("\\u0026", "&"));
        }
        in.close();
        html = str.toString();
    
        // Parse the HTML response and extract the streaming URIs
        if (html.contains("verify-age-thumb")) {
            CLog.w("YouTube is asking for age verification. We can't handle that sorry.");
            return null;
        }
    
        if (html.contains("das_captcha")) {
            CLog.w("Captcha found, please try with different IP address.");
            return null;
        }
    
        Pattern p = Pattern.compile("stream_map\": \"(.*?)?\"");
        // Pattern p = Pattern.compile("/stream_map=(.[^&]*?)\"/");
        Matcher m = p.matcher(html);
        List<String> matches = new ArrayList<String>();
        while (m.find()) {
            matches.add(m.group());
        }
    
        if (matches.size() != 1) {
            CLog.w("Found zero or too many stream maps.");
            return null;
        }
    
        String urls[] = matches.get(0).split(",");
        HashMap<String, String> foundArray = new HashMap<String, String>();
        for (String ppUrl : urls) {
            String url = URLDecoder.decode(ppUrl, "UTF-8");
    
            Pattern p1 = Pattern.compile("itag=([0-9]+?)[&]");
            Matcher m1 = p1.matcher(url);
            String itag = null;
            if (m1.find()) {
                itag = m1.group(1);
            }
    
            Pattern p2 = Pattern.compile("sig=(.*?)[&]");
            Matcher m2 = p2.matcher(url);
            String sig = null;
            if (m2.find()) {
                sig = m2.group(1);
            }
    
            Pattern p3 = Pattern.compile("url=(.*?)[&]");
            Matcher m3 = p3.matcher(ppUrl);
            String um = null;
            if (m3.find()) {
                um = m3.group(1);
            }
    
            if (itag != null && sig != null && um != null) {
                foundArray.put(itag, URLDecoder.decode(um, "UTF-8") + "&"
                        + "signature=" + sig);
            }
        }
    
        if (foundArray.size() == 0) {
            CLog.w("Couldn't find any URLs and corresponding signatures");
            return null;
        }
    
        HashMap<String, Meta> typeMap = new HashMap<String, Meta>();
        typeMap.put("13", new Meta("13", "3GP", "Low Quality - 176x144"));
        typeMap.put("17", new Meta("17", "3GP", "Medium Quality - 176x144"));
        typeMap.put("36", new Meta("36", "3GP", "High Quality - 320x240"));
        typeMap.put("5", new Meta("5", "FLV", "Low Quality - 400x226"));
        typeMap.put("6", new Meta("6", "FLV", "Medium Quality - 640x360"));
        typeMap.put("34", new Meta("34", "FLV", "Medium Quality - 640x360"));
        typeMap.put("35", new Meta("35", "FLV", "High Quality - 854x480"));
        typeMap.put("43", new Meta("43", "WEBM", "Low Quality - 640x360"));
        typeMap.put("44", new Meta("44", "WEBM", "Medium Quality - 854x480"));
        typeMap.put("45", new Meta("45", "WEBM", "High Quality - 1280x720"));
        typeMap.put("18", new Meta("18", "MP4", "Medium Quality - 480x360"));
        typeMap.put("22", new Meta("22", "MP4", "High Quality - 1280x720"));
        typeMap.put("37", new Meta("37", "MP4", "High Quality - 1920x1080"));
        // Key fixed to match the Meta number (was "33"), so itag 38 can be looked up
        typeMap.put("38", new Meta("38", "MP4", "High Quality - 4096x230"));
    
        ArrayList<Video> videos = new ArrayList<ARViewer.Video>();
    
        for (String format : typeMap.keySet()) {
            Meta meta = typeMap.get(format);
    
            if (foundArray.containsKey(format)) {
                Video newVideo = new Video(meta.ext, meta.type,
                        foundArray.get(format));
                videos.add(newVideo);
                CLog.d("YouTube Video streaming details: ext:" + newVideo.ext
                        + ", type:" + newVideo.type + ", url:" + newVideo.url);
            }
        }
    
        return videos;
    }
    
    private class YouTubePageStreamUriGetter extends
            AsyncTask<String, String, String> {
        ProgressDialog progressDialog;
    
        @Override
        protected void onPreExecute() {
            super.onPreExecute();
            progressDialog = ProgressDialog.show(ARViewer.this, "",
                    "Connecting to YouTube...", true);
        }
    
        @Override
        protected String doInBackground(String... params) {
            String url = params[0];
            try {
                ArrayList<Video> videos = getStreamingUrisFromYouTubePage(url);
                if (videos != null && !videos.isEmpty()) {
                    String retVidUrl = null;
                    for (Video video : videos) {
                        if (video.ext.toLowerCase().contains("mp4")
                                && video.type.toLowerCase().contains("medium")) {
                            retVidUrl = video.url;
                            break;
                        }
                    }
                    if (retVidUrl == null) {
                        for (Video video : videos) {
                            if (video.ext.toLowerCase().contains("3gp")
                                    && video.type.toLowerCase().contains(
                                            "medium")) {
                                retVidUrl = video.url;
                                break;
    
                            }
                        }
                    }
                    if (retVidUrl == null) {
    
                        for (Video video : videos) {
                            if (video.ext.toLowerCase().contains("mp4")
                                    && video.type.toLowerCase().contains("low")) {
                                retVidUrl = video.url;
                                break;
    
                            }
                        }
                    }
                    if (retVidUrl == null) {
                        for (Video video : videos) {
                            if (video.ext.toLowerCase().contains("3gp")
                                    && video.type.toLowerCase().contains("low")) {
                                retVidUrl = video.url;
                                break;
                            }
                        }
                    }
    
                    return retVidUrl;
                }
            } catch (Exception e) {
                CLog.e("Couldn't get YouTube streaming URL", e);
            }
            CLog.w("Couldn't get stream URI for " + url);
            return null;
        }
    
        @Override
        protected void onPostExecute(String streamingUrl) {
            super.onPostExecute(streamingUrl);
            progressDialog.dismiss();
            if (streamingUrl != null) {
                             /* Do what ever you want with streamUrl */
            }
        }
    }
    

    This code does not work.

    1 answer  |  as of 6 years ago
  •   Ashvin solanki    5 years ago

    Edit 3

    You can use this library: https://github.com/HaarigerHarald/android-youtubeExtractor

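    String youtubeLink = "http://youtube.com/watch?v=xxxx";
    
    new YouTubeExtractor(this) {
        @Override
        public void onExtractionComplete(SparseArray<YtFile> ytFiles, VideoMeta vMeta) {
            if (ytFiles != null) {
                int itag = 22; // itag 22 is listed as MP4 1280x720 in the type map further down
                YtFile ytFile = ytFiles.get(itag);
                // SparseArray.get returns null when this itag is not available for the video
                if (ytFile != null) {
                    String downloadUrl = ytFile.getUrl();
                }
            }
        }
    }.extract(youtubeLink, true, true);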
    

    The library deciphers the signature with the following method:

    private boolean decipherSignature(final SparseArray<String> encSignatures) throws IOException {
        // Assume the functions don't change that much
        if (decipherFunctionName == null || decipherFunctions == null) {
            String decipherFunctUrl = "https://s.ytimg.com/yts/jsbin/" + decipherJsFileName;
    
            BufferedReader reader = null;
            String javascriptFile;
            URL url = new URL(decipherFunctUrl);
            HttpURLConnection urlConnection = (HttpURLConnection) url.openConnection();
            urlConnection.setRequestProperty("User-Agent", USER_AGENT);
            try {
                reader = new BufferedReader(new InputStreamReader(urlConnection.getInputStream()));
                StringBuilder sb = new StringBuilder("");
                String line;
                while ((line = reader.readLine()) != null) {
                    sb.append(line);
                    sb.append(" ");
                }
                javascriptFile = sb.toString();
            } finally {
                if (reader != null)
                    reader.close();
                urlConnection.disconnect();
            }
    
            if (LOGGING)
                Log.d(LOG_TAG, "Decipher FunctURL: " + decipherFunctUrl);
            Matcher mat = patSignatureDecFunction.matcher(javascriptFile);
            if (mat.find()) {
                decipherFunctionName = mat.group(1);
                if (LOGGING)
                    Log.d(LOG_TAG, "Decipher Functname: " + decipherFunctionName);
    
                Pattern patMainVariable = Pattern.compile("(var |\\s|,|;)" + decipherFunctionName.replace("$", "\\$") +
                        "(=function\\((.{1,3})\\)\\{)");
    
                String mainDecipherFunct;
    
                mat = patMainVariable.matcher(javascriptFile);
                if (mat.find()) {
                    mainDecipherFunct = "var " + decipherFunctionName + mat.group(2);
                } else {
                    Pattern patMainFunction = Pattern.compile("function " + decipherFunctionName.replace("$", "\\$") +
                            "(\\((.{1,3})\\)\\{)");
                    mat = patMainFunction.matcher(javascriptFile);
                    if (!mat.find())
                        return false;
                    mainDecipherFunct = "function " + decipherFunctionName + mat.group(2);
                }
    
                int startIndex = mat.end();
    
                for (int braces = 1, i = startIndex; i < javascriptFile.length(); i++) {
                    if (braces == 0 && startIndex + 5 < i) {
                        mainDecipherFunct += javascriptFile.substring(startIndex, i) + ";";
                        break;
                    }
                    if (javascriptFile.charAt(i) == '{')
                        braces++;
                    else if (javascriptFile.charAt(i) == '}')
                        braces--;
                }
                decipherFunctions = mainDecipherFunct;
                // Search the main function for extra functions and variables
                // needed for deciphering
                // Search for variables
                mat = patVariableFunction.matcher(mainDecipherFunct);
                while (mat.find()) {
                    String variableDef = "var " + mat.group(2) + "={";
                    if (decipherFunctions.contains(variableDef)) {
                        continue;
                    }
                    startIndex = javascriptFile.indexOf(variableDef) + variableDef.length();
                    for (int braces = 1, i = startIndex; i < javascriptFile.length(); i++) {
                        if (braces == 0) {
                            decipherFunctions += variableDef + javascriptFile.substring(startIndex, i) + ";";
                            break;
                        }
                        if (javascriptFile.charAt(i) == '{')
                            braces++;
                        else if (javascriptFile.charAt(i) == '}')
                            braces--;
                    }
                }
                // Search for functions
                mat = patFunction.matcher(mainDecipherFunct);
                while (mat.find()) {
                    String functionDef = "function " + mat.group(2) + "(";
                    if (decipherFunctions.contains(functionDef)) {
                        continue;
                    }
                    startIndex = javascriptFile.indexOf(functionDef) + functionDef.length();
                    for (int braces = 0, i = startIndex; i < javascriptFile.length(); i++) {
                        if (braces == 0 && startIndex + 5 < i) {
                            decipherFunctions += functionDef + javascriptFile.substring(startIndex, i) + ";";
                            break;
                        }
                        if (javascriptFile.charAt(i) == '{')
                            braces++;
                        else if (javascriptFile.charAt(i) == '}')
                            braces--;
                    }
                }
    
                if (LOGGING)
                    Log.d(LOG_TAG, "Decipher Function: " + decipherFunctions);
                decipherViaWebView(encSignatures);
                if (CACHING) {
                    writeDeciperFunctToChache();
                }
            } else {
                return false;
            }
        } else {
            decipherViaWebView(encSignatures);
        }
        return true;
    }
    

    Videos fetched with this library currently come without audio, so I use MediaMuxer to merge the audio and video into the final output.

    https://stackoverflow.com/a/15240012/9909365

    Pattern p2 = Pattern.compile("sig=(.*?)[&]");
    Matcher m2 = p2.matcher(url);
    String sig = null;
    if (m2.find()) {
        sig = m2.group(1);
    }
    

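    As of November 2016 this is a little rough around the edges, but it shows the basic principle. Today's url_encoded_fmt_stream_map has no space after the colon (better to make it optional), and "sig" has been changed to "signature".

    Many video URLs also contain signature&s.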
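
    A minimal sketch of the two pattern changes described above, assuming plain java.util.regex. The stream map pattern accepts optional whitespace around the colon, and one signature pattern accepts sig=, signature= or a bare s= key (which also covers the signature&s= case). StreamMapPatterns and its method names are hypothetical, and findSignature expects an entry that has already been URL-decoded.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Sketch only: tolerant matching for the stream map and the signature key. */
class StreamMapPatterns {
    // Whitespace around the colon is optional, so both old and new pages match.
    static final Pattern STREAM_MAP =
            Pattern.compile("stream_map\"\\s*:\\s*\"(.*?)\"");

    // Key must be exactly sig, signature or s; the value ends at '&' or end of input.
    static final Pattern SIGNATURE =
            Pattern.compile("(?:^|&)(?:sig|signature|s)=([^&]*)");

    /** Returns the raw stream map value, or null when the page has none. */
    static String findStreamMap(String html) {
        Matcher m = STREAM_MAP.matcher(html);
        return m.find() ? m.group(1) : null;
    }

    /** Returns the signature value of one decoded stream entry, or null. */
    static String findSignature(String entry) {
        Matcher m = SIGNATURE.matcher(entry);
        return m.find() ? m.group(1) : null;
    }
}
```

    Because the value is matched with [^&]*, a signature that is the last field of an entry is still found, which the original "(.*?)[&]" patterns would miss.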

    Here is the edited answer:

    private static final HashMap<String, Meta> typeMap = new HashMap<String, Meta>();
    

    Call initTypeMap(); first.

    class Meta {
        public String num;
        public String type;
        public String ext;
    
        Meta(String num, String ext, String type) {
            this.num = num;
            this.ext = ext;
            this.type = type;
        }
    }
    
    class Video {
        public String ext = "";
        public String type = "";
        public String url = "";
    
        Video(String ext, String type, String url) {
            this.ext = ext;
            this.type = type;
            this.url = url;
        }
    }
    
    public ArrayList<Video> getStreamingUrisFromYouTubePage(String ytUrl)
            throws IOException {
        if (ytUrl == null) {
            return null;
        }
    
        // Remove any query params in query string after the watch?v=<vid> in
        // e.g.
        // http://www.youtube.com/watch?v=0RUPACpf8Vs&feature=youtube_gdata_player
        int andIdx = ytUrl.indexOf('&');
        if (andIdx >= 0) {
            ytUrl = ytUrl.substring(0, andIdx);
        }
    
        // Get the HTML response
        /* String userAgent = "Mozilla/5.0 (Windows NT 6.1; WOW64; rv:8.0.1)";*/
       /* HttpClient client = new DefaultHttpClient();
        client.getParams().setParameter(CoreProtocolPNames.USER_AGENT,
                userAgent);
        HttpGet request = new HttpGet(ytUrl);
        HttpResponse response = client.execute(request);*/
        String html = "";
        HttpsURLConnection c = (HttpsURLConnection) new URL(ytUrl).openConnection();
        c.setRequestMethod("GET");
        // No setDoOutput(true): it is meant for requests with a body and can make the GET go out as a POST
        c.connect();
        StringBuilder str = new StringBuilder();
        // try-with-resources closes the reader (and the underlying stream) even on failure
        try (BufferedReader reader = new BufferedReader(
                new InputStreamReader(c.getInputStream(), "UTF-8"))) {
            String line;
            while ((line = reader.readLine()) != null) {
                str.append(line.replace("\\u0026", "&"));
            }
        } finally {
            c.disconnect();
        }
        html = str.toString();
    
        // Parse the HTML response and extract the streaming URIs
        if (html.contains("verify-age-thumb")) {
            Log.e("Downloader", "YouTube is asking for age verification. We can't handle that sorry.");
            return null;
        }
    
        if (html.contains("das_captcha")) {
            Log.e("Downloader", "Captcha found, please try with different IP address.");
            return null;
        }
    
        // Whitespace after the colon is optional, so both page variants match
        Pattern p = Pattern.compile("stream_map\":\\s*\"(.*?)\"");
        // Pattern p = Pattern.compile("/stream_map=(.[^&]*?)\"/");
        Matcher m = p.matcher(html);
        List<String> matches = new ArrayList<String>();
        while (m.find()) {
            matches.add(m.group());
        }
    
        if (matches.size() != 1) {
            Log.e("Downloader", "Found zero or too many stream maps.");
            return null;
        }
    
        String urls[] = matches.get(0).split(",");
        HashMap<String, String> foundArray = new HashMap<String, String>();
        for (String ppUrl : urls) {
            String url = URLDecoder.decode(ppUrl, "UTF-8");
            Log.e("URL","URL : "+url);
    
            Pattern p1 = Pattern.compile("itag=([0-9]+?)[&]");
            Matcher m1 = p1.matcher(url);
            String itag = null;
            if (m1.find()) {
                itag = m1.group(1);
            }
    
            Pattern p2 = Pattern.compile("signature=(.*?)[&]");
            Matcher m2 = p2.matcher(url);
            String sig = null;
            if (m2.find()) {
                sig = m2.group(1);
            } else {
                Pattern p23 = Pattern.compile("signature&s=(.*?)[&]");
                Matcher m23 = p23.matcher(url);
                if (m23.find()) {
                    sig = m23.group(1);
                }
            }
    
            Pattern p3 = Pattern.compile("url=(.*?)[&]");
            Matcher m3 = p3.matcher(ppUrl);
            String um = null;
            if (m3.find()) {
                um = m3.group(1);
            }
    
            if (itag != null && sig != null && um != null) {
                Log.e("foundArray","Adding Value");
                foundArray.put(itag, URLDecoder.decode(um, "UTF-8") + "&"
                        + "signature=" + sig);
            }
        }
        Log.e("foundArray","Size : "+foundArray.size());
        if (foundArray.size() == 0) {
            Log.e("Downloader", "Couldn't find any URLs and corresponding signatures");
            return null;
        }
    
    
        ArrayList<Video> videos = new ArrayList<Video>();
    
        for (String format : typeMap.keySet()) {
            Meta meta = typeMap.get(format);
    
            if (foundArray.containsKey(format)) {
                Video newVideo = new Video(meta.ext, meta.type,
                        foundArray.get(format));
                videos.add(newVideo);
                Log.d("Downloader", "YouTube Video streaming details: ext:" + newVideo.ext
                        + ", type:" + newVideo.type + ", url:" + newVideo.url);
            }
        }
    
        return videos;
    }
    
    private class YouTubePageStreamUriGetter extends AsyncTask<String, String, ArrayList<Video>> {
        ProgressDialog progressDialog;
    
        @Override
        protected void onPreExecute() {
            super.onPreExecute();
            progressDialog = ProgressDialog.show(webViewActivity.this, "",
                    "Connecting to YouTube...", true);
        }
    
        @Override
        protected ArrayList<Video> doInBackground(String... params) {
            ArrayList<Video> fVideos = new ArrayList<>();
            String url = params[0];
            try {
                ArrayList<Video> videos = getStreamingUrisFromYouTubePage(url);
                /*                Log.e("Downloader","Size of Video : "+videos.size());*/
                if (videos != null && !videos.isEmpty()) {
                    for (Video video : videos)
                    {
                        Log.e("Downloader", "ext : " + video.ext);
                        if (video.ext.toLowerCase().contains("mp4") || video.ext.toLowerCase().contains("3gp") || video.ext.toLowerCase().contains("flv") || video.ext.toLowerCase().contains("webm")) {
                            ext = video.ext.toLowerCase();
                            fVideos.add(new Video(video.ext,video.type,video.url));
                        }
                    }
    
    
                    return fVideos;
                }
            } catch (Exception e) {
                e.printStackTrace();
                Log.e("Downloader", "Couldn't get YouTube streaming URL", e);
            }
            Log.e("Downloader", "Couldn't get stream URI for " + url);
            return null;
        }
    
        @Override
        protected void onPostExecute(ArrayList<Video> streamingUrl) {
            super.onPostExecute(streamingUrl);
            progressDialog.dismiss();
            if (streamingUrl != null) {
                if (!streamingUrl.isEmpty()) {
                    //Log.e("Streaming Url", "Value : " + streamingUrl);
    
                    for (int i = 0; i < streamingUrl.size(); i++) {
                        Video fX = streamingUrl.get(i);
                        Log.e("Found Video", "URL : " + fX.url);
                        Log.e("Found Video", "TYPE : " + fX.type);
                        Log.e("Found Video", "EXT : " + fX.ext);
                    }
                    //new ProgressBack().execute(new String[]{streamingUrl, filename + "." + ext});
                }
            }
        }
    }
    public void initTypeMap()
    {
        typeMap.put("13", new Meta("13", "3GP", "Low Quality - 176x144"));
        typeMap.put("17", new Meta("17", "3GP", "Medium Quality - 176x144"));
        typeMap.put("36", new Meta("36", "3GP", "High Quality - 320x240"));
        typeMap.put("5", new Meta("5", "FLV", "Low Quality - 400x226"));
        typeMap.put("6", new Meta("6", "FLV", "Medium Quality - 640x360"));
        typeMap.put("34", new Meta("34", "FLV", "Medium Quality - 640x360"));
        typeMap.put("35", new Meta("35", "FLV", "High Quality - 854x480"));
        typeMap.put("43", new Meta("43", "WEBM", "Low Quality - 640x360"));
        typeMap.put("44", new Meta("44", "WEBM", "Medium Quality - 854x480"));
        typeMap.put("45", new Meta("45", "WEBM", "High Quality - 1280x720"));
        typeMap.put("18", new Meta("18", "MP4", "Medium Quality - 480x360"));
        typeMap.put("22", new Meta("22", "MP4", "High Quality - 1280x720"));
        typeMap.put("37", new Meta("37", "MP4", "High Quality - 1920x1080"));
        // Key fixed to match the Meta number (was "33"), so itag 38 can be looked up
        typeMap.put("38", new Meta("38", "MP4", "High Quality - 4096x230"));
    }
    

    Edit 2:

    Same-origin policy

    https://en.wikipedia.org/wiki/Same-origin_policy

    https://en.wikipedia.org/wiki/Cross-origin_resource_sharing

    This is a Same-origin policy problem. Essentially, you cannot download this file from www.youtube.com because they are different domains. A workaround for this problem is CORS.
    

    https://superuser.com/questions/773719/how-do-all-of-these-save-video-from-youtube-services-work/773998#773998

    url_encoded_fmt_stream_map // traditional: contains video and audio stream
    adaptive_fmts              // DASH: contains video or audio stream
    

    Each of these is a comma-separated array of what I would call "stream objects". Each "stream object" contains values like these:

    url  // direct HTTP link to a video
    itag // code specifying the quality
    s    // signature, security measure to counter downloading
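
    The "stream object" layout above could be parsed roughly as follows. This is a sketch, assuming each stream object is an &-separated list of URL-encoded key=value pairs and that commas inside values are percent-encoded. StreamMapParser and parse are hypothetical names, not part of any library.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Sketch only: splits a stream map into "stream objects". */
class StreamMapParser {
    /** Returns one key/value map (url, itag, s, ...) per comma-separated entry. */
    static List<Map<String, String>> parse(String streamMap) {
        List<Map<String, String>> streams = new ArrayList<>();
        for (String entry : streamMap.split(",")) {
            Map<String, String> fields = new LinkedHashMap<>();
            for (String pair : entry.split("&")) {
                int eq = pair.indexOf('=');
                if (eq <= 0) {
                    continue; // skip malformed pairs that have no key
                }
                fields.put(pair.substring(0, eq), decode(pair.substring(eq + 1)));
            }
            if (!fields.isEmpty()) {
                streams.add(fields);
            }
        }
        return streams;
    }

    private static String decode(String value) {
        try {
            return URLDecoder.decode(value, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }
}
```

    A stream object without an s key would then be the unsecured case, where the decoded url value can be used directly.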
    

    YouTube videos have at least 3 security levels:

    unsecured // as expected, you can download these with just the unencoded URL
    s         // see below
    RTMPE     // uses "rtmpe://" protocol, no known method for these
    

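    RTMPE videos are typically used for official full-length movies and are protected with SWF Verification Type 2. This has been in place since 2011 and has yet to be reverse engineered.

    Type "s" videos are the most difficult to download. You will typically see these on VEVO videos. They start with a signature such as

    AA5D05FA7771AD4868BA4C977C3DEAAC620DE020E.0F42820F42978A1F8EAFCDAC4EF507DB5

    The signature is then scrambled with a function like this: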

    function mo(a) {
      a = a.split("");
      a = lo.rw(a, 1);
      a = lo.rw(a, 32);
      a = lo.IC(a, 1);
      a = lo.wS(a, 77);
      a = lo.IC(a, 3);
      a = lo.wS(a, 77);
      a = lo.IC(a, 3);
      a = lo.wS(a, 44);
      return a.join("")
    }
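
    To make the scrambling idea concrete, here is a Java sketch. The bodies of lo.rw, lo.IC and lo.wS are not shown above, so the three helpers below (reverse, drop the first n characters, swap the first character with another one) are assumptions about what helpers of this kind might do, and the step order in scramble is purely illustrative. The real steps and arguments have to be read from the current player script.

```java
/** Sketch only: a scrambling pipeline built from three assumed helper operations. */
class SignatureScrambler {
    /** Reverses the whole signature. */
    static String reverse(String s) {
        return new StringBuilder(s).reverse().toString();
    }

    /** Drops the first n characters (or everything if n exceeds the length). */
    static String dropFirst(String s, int n) {
        return s.substring(Math.min(n, s.length()));
    }

    /** Swaps the first character with the one at index n modulo the length. */
    static String swapFirst(String s, int n) {
        if (s.isEmpty()) {
            return s;
        }
        char[] c = s.toCharArray();
        int i = n % c.length;
        char tmp = c[0];
        c[0] = c[i];
        c[i] = tmp;
        return new String(c);
    }

    /** Hypothetical step order; the real one changes along with the player script. */
    static String scramble(String sig) {
        String a = sig;
        a = swapFirst(a, 1);
        a = reverse(a);
        a = dropFirst(a, 3);
        return a;
    }
}
```

    Keeping each operation as a small pure function makes it easier to rebuild the pipeline whenever the hosted function changes.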
    

    This function is dynamic; it typically changes every day. To make it more difficult, the function is hosted at a URL such as

    http://s.ytimg.com/yts/jsbin/html5player-en_US-vflycBCEX.js

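    This introduces the Same-origin policy problem. Essentially, you cannot download this file from www.youtube.com because they are different domains. A workaround for this problem is CORS. With CORS, s.ytimg.com could add this header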

    Access-Control-Allow-Origin: http://www.youtube.com
    

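    This would allow JavaScript running on www.youtube.com to download the file. Of course they do not do this. The workaround for this workaround is to use a CORS proxy, that is, a proxy that responds to all requests with the following header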

    Access-Control-Allow-Origin: *
    

    So, now that you have proxied your JS file and used the function to scramble the signature, you can use the result in the query string to download the video.
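
    The last step, putting the scrambled signature into the query string, might look like this. It is a sketch that assumes the parameter is named signature, as in the edited code earlier in this answer. DownloadUrlBuilder and withSignature are hypothetical names.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

/** Sketch only: appends a signature to a stream URL as a query parameter. */
class DownloadUrlBuilder {
    static String withSignature(String streamUrl, String signature) {
        // Use '&' when the URL already has a query string, '?' otherwise.
        String separator = streamUrl.contains("?") ? "&" : "?";
        try {
            return streamUrl + separator + "signature="
                    + URLEncoder.encode(signature, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }
}
```

    The signature is URL-encoded here as a precaution; for values made only of letters, digits and dots the result is the same as plain concatenation.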