If you want to dump all the links on a page to a text file, including hidden ones, you can use lynx.

This may be useful for debugging, or for bookmarking all the links on a webpage that interests you.

It is also handy if you want to download all files of a specific type from a webpage, let's say all the mp3 or jpg files on an index page.

Well, let's see it in action.

lynx --dump http://www.go2linux.org | grep http

The output is:

   1. http://www.go2linux.org/rss.xml
   2. http://www.go2linux.org/
   3. http://www.go2linux.org/
   4. http://www.go2linux.org/all-linux-posts
   5. http://www.go2linux.org/voip-posts
   6. http://www.go2linux.org/web-design-posts
   7. http://www.go2linux.org/all-gadgets-posts
   8. http://www.go2linux.org/all-my-personal-posts
   9. http://www.go2linux.org/archive
  10. http://www.go2linux.org/rss.xml
  11. http://www.go2linux.org/debian-ubuntu-1
  12. http://www.go2linux.org/all-about-security-1
  13. http://www.go2linux.org/for-the-sysadmin
  14. http://www.go2linux.org/paypal-donate-go2linux-726
  15. http://posterous.com/people/5AvACVG2bnJn
  16. http://www.go2linux.org/contribute-go2linux-747
  17. http://www.go2linux.org/contact-me-748
  18. http://www.go2linux.org/glossary
  19. http://g.garron.me/
  20. http://www.go2linux.org/linux/2010/09/share-files-python-m-simplehttpserver-775
  21. http://www.go2linux.org/taxonomy/term/36
  22. http://www.go2linux.org/taxonomy/term/25
  23. http://www.go2linux.org/linux/2010/09/share-files-python-m-simplehttpserver-775
  24. http://www.go2linux.org/node/775#disqus_thread
  25. http://www.go2linux.org/linux/2010/09/linux-tip-when-you-forget-use-sudo-774
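
As an aside, depending on your lynx version, the -listonly flag prints just the list of references, so you can skip the grep:

lynx --dump -listonly http://www.go2linux.org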

If you want the same output without the numbering, you can use:

lynx --dump http://www.go2linux.org | awk '/http/{print $2}'

The output will be:

http://www.go2linux.org/rss.xml
http://www.go2linux.org/
http://www.go2linux.org/
http://www.go2linux.org/all-linux-posts
http://www.go2linux.org/voip-posts
http://www.go2linux.org/web-design-posts
http://www.go2linux.org/all-gadgets-posts
http://www.go2linux.org/all-my-personal-posts
http://www.go2linux.org/archive
http://www.go2linux.org/rss.xml
http://www.go2linux.org/debian-ubuntu-1
http://www.go2linux.org/all-about-security-1
http://www.go2linux.org/for-the-sysadmin
http://www.go2linux.org/paypal-donate-go2linux-726
http://posterous.com/people/5AvACVG2bnJn
http://www.go2linux.org/contribute-go2linux-747
http://www.go2linux.org/contact-me-748
http://www.go2linux.org/glossary
http://g.garron.me/
http://www.go2linux.org/linux/2010/09/share-files-python-m-simplehttpserver-775
http://www.go2linux.org/taxonomy/term/36
http://www.go2linux.org/taxonomy/term/25
http://www.go2linux.org/linux/2010/09/share-files-python-m-simplehttpserver-775
http://www.go2linux.org/node/775#disqus_thread
http://www.go2linux.org/linux/2010/09/linux-tip-when-you-forget-use-sudo-774
http://www.go2linux.org/taxonomy/term/25
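
To actually dump the list to a text file, as mentioned at the beginning, just redirect the output, say to /tmp/links.txt:

lynx --dump http://www.go2linux.org | awk '/http/{print $2}' > /tmp/links.txt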

If you want to download any specific type of file, just grep this output for that file type, then use a bash for loop with wget to download the matches.

It could be something like this:

lynx --dump http://somesite.com/page.html | awk '/http/{print $2}' | grep jpg > /tmp/file.txt
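
Note that a plain grep jpg matches any URL containing the string jpg anywhere in it; if you only want URLs that end in .jpg, anchor the pattern:

lynx --dump http://somesite.com/page.html | awk '/http/{print $2}' | grep '\.jpg$' > /tmp/file.txt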

Then use a simple for loop to download the files.

for i in $( cat /tmp/file.txt ); do wget "$i"; done
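
Alternatively, wget can read a list of URLs straight from a file with its -i option, which makes the loop unnecessary:

wget -i /tmp/file.txt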

This is just one example of how this technique can be used; there are other ways to download specific files from a site, some perhaps easier than this one.
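
One such alternative, just as a sketch: wget alone can crawl a page and keep only files of a given type. Here -r recurses, -l1 limits the depth to one level, -nd skips creating directories, and -A jpg accepts only jpg files:

wget -r -l1 -nd -A jpg http://somesite.com/page.html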