您发布的Uri(
https://sede.educacion.gob.es/publiventa/catalogo.action?cod=E
)使用Javascript开关显示菜单内容。
当您连接到该Uri(不单击菜单链接)时,该站点将显示该页面的三个不同版本。
1) 带有关闭菜单和建议的新版本的页面
2) 带有关闭菜单和搜索引擎字段的页面
3) 具有打开菜单和菜单内容选择的页面
此切换基于记录当前会话的内部过程。除非单击菜单链接(连接到事件侦听器),否则Javascript过程将以不同的状态显示页面。
我看了它一眼;这些脚本相当长(一个完整的多用途库),我没有时间对其进行解析(可能您可以这样做),以找出事件侦听器传递的参数。
但是,三态版本切换是恒定的。
我的意思是,您可以三次调用该页面,保留Cookie容器:第三次连接到它时,它将流式处理整个菜单内容及其链接。
如果您三次请求同一页面,第三次Html页面将
包含所有
Materias de educación
链接
public async void SomeMethodAsync()
{
string HtmlPage = await GetHttpStream([URI]);
HtmlPage = await GetHttpStream([URI]);
HtmlPage = await GetHttpStream([URI]);
}
或多或少,这就是我用来获取该页面的内容:
CookieContainer CookieJar = new CookieContainer();
public async Task<string> GetHttpStream(Uri HtmlPage)
{
HttpWebRequest httpRequest;
string Payload = string.Empty;
httpRequest = WebRequest.CreateHttp(HtmlPage);
try
{
httpRequest.CookieContainer = CookieJar;
httpRequest.KeepAlive = true;
httpRequest.ConnectionGroupName = Guid.NewGuid().ToString();
httpRequest.AllowAutoRedirect = true;
httpRequest.AutomaticDecompression = DecompressionMethods.GZip | DecompressionMethods.Deflate;
httpRequest.ServicePoint.MaxIdleTime = 30000;
httpRequest.ServicePoint.Expect100Continue = false;
httpRequest.UserAgent = "Mozilla/5.0 (Windows NT 10; Win64; x64; rv:56.0) Gecko/20100101 Firefox/56.0";
httpRequest.Accept = "ext/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8";
httpRequest.Headers.Add(HttpRequestHeader.AcceptLanguage, "es-ES,es;q=0.8,en-US;q=0.5,en;q=0.3");
httpRequest.Headers.Add(HttpRequestHeader.AcceptEncoding, "gzip, deflate;q=0.8");
httpRequest.Headers.Add(HttpRequestHeader.CacheControl, "no-cache");
using (HttpWebResponse httpResponse = (HttpWebResponse)await httpRequest.GetResponseAsync())
{
Stream ResponseStream = httpResponse.GetResponseStream();
if (httpResponse.StatusCode == HttpStatusCode.OK)
{
try
{
//ResponseStream.Position = 0;
Encoding encoding = Encoding.GetEncoding(httpResponse.CharacterSet);
using (MemoryStream _memStream = new MemoryStream())
{
if (httpResponse.ContentEncoding.Contains("gzip"))
{
using (GZipStream _gzipStream = new GZipStream(ResponseStream, System.IO.Compression.CompressionMode.Decompress))
{
_gzipStream.CopyTo(_memStream);
};
}
else if (httpResponse.ContentEncoding.Contains("deflate"))
{
using (DeflateStream _deflStream = new DeflateStream(ResponseStream, System.IO.Compression.CompressionMode.Decompress))
{
_deflStream.CopyTo(_memStream);
};
}
else
{
ResponseStream.CopyTo(_memStream);
}
_memStream.Position = 0;
using (StreamReader _reader = new StreamReader(_memStream, encoding))
{
Payload = _reader.ReadToEnd().Trim();
};
};
}
catch (Exception)
{
Payload = string.Empty;
}
}
}
}
catch (WebException exW)
{
if (exW.Response != null)
{
//Handle WebException
}
}
catch (System.Exception exS)
{
//Handle System.Exception
}
CookieJar = httpRequest.CookieContainer;
return Payload;
}