Just because a web crawler is complex to implement doesn't mean it has to be complex to use. Offer only what is really necessary and use sensible defaults for the rest. That will cover roughly 80% of use cases; users in the other 20% will be more willing to gain a deeper understanding.
-
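Why would they need to understand this? It depends on the expected usage, but I would assume that in most cases people crawl a complete website, so all they need to provide is the domain.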
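
As a rough sketch of what "only the domain, defaults for everything else" could look like behind the UI: the class name `CrawlSettings`, its fields, and the default values below are all hypothetical, chosen purely for illustration and not taken from any particular crawler.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


@dataclass
class CrawlSettings:
    # The only value the user has to supply.
    start_url: str
    # Hypothetical defaults; these would live in an "advanced" section.
    max_depth: int = 5
    delay_seconds: float = 1.0
    respect_robots_txt: bool = True

    @property
    def domain(self) -> str:
        # Accept a bare "example.com" as well as a full URL.
        url = self.start_url if "://" in self.start_url else "//" + self.start_url
        return urlparse(url).hostname or ""

    def in_scope(self, url: str) -> bool:
        # Expects an absolute URL; anything on another host is out of scope.
        return urlparse(url).hostname == self.domain


settings = CrawlSettings("example.com")
print(settings.domain, settings.in_scope("https://example.com/about"))
```

The point is that the basic form needs a single text field; the other values only surface for users who go looking for them.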
-
Gert G's suggestion of a slider with an extending folder structure was a good one. It doesn't have to be dynamic for the site in question, just an illustration of what the setting means.
-
Forget exposing file extensions. Instead, offer common file types with icons, possibly grouping them (e.g. all common image types, such as jpg, png and gif, go into one 'images' type). Only expose raw file extension settings in an advanced config section; those who need them will understand them.
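
To illustrate the grouping idea, here is a minimal sketch of how ticked file type groups might be expanded into extensions internally. The group names, the extension lists, and the helper names `allowed_extensions` and `wants` are assumptions made up for this example; the lists are not meant to be exhaustive.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Hypothetical groups shown to the user as icons; extensions stay hidden.
FILE_TYPE_GROUPS = {
    "images": {".jpg", ".jpeg", ".png", ".gif"},
    "documents": {".pdf", ".doc", ".docx"},
    "pages": {".html", ".htm"},
}


def allowed_extensions(selected_groups, extra_extensions=()):
    """Expand ticked groups into extensions; extra_extensions is the advanced override."""
    allowed = set()
    for group in selected_groups:
        allowed |= FILE_TYPE_GROUPS[group]
    allowed |= {ext.lower() for ext in extra_extensions}
    return allowed


def wants(url, allowed):
    # Compare the extension of the URL path, ignoring case and query strings.
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    return suffix in allowed


allowed = allowed_extensions(["images"], extra_extensions=[".svg"])
print(wants("https://example.com/img/logo.PNG", allowed))
```

Most users would only ever touch the group checkboxes; `extra_extensions` stands in for the raw setting tucked away in the advanced section.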
-
I really don't see why they would need to understand this. Surely that is the crawler's job.