docs/TODO - third_party/curl - Git at Google

                                   _   _ ____  _
                               ___| | | |  _ \| |
                              / __| | | | |_) | |
                             | (__| |_| |  _ <| |___
                              \___|\___/|_| \_\_____|

 TODO

  Things to do in project cURL. Please tell us what you think, contribute and
  send us patches that improve things! Also check the http://curl.haxx.se/dev
  web section for various technical development notes.

  LIBCURL

  * Introduce an interface to libcurl that allows applications to easier get to
    know what cookies that are received. Pushing interface that calls a
    callback on each received cookie? Querying interface that asks about
    existing cookies? We probably need both.

  * Make content encoding/decoding internally be made using a filter system.

  * Introduce another callback interface for upload/download that makes one
    less copy of data and thus a faster operation.
    [http://curl.haxx.se/dev/no_copy_callbacks.txt]

  * Run-time querying about library characterics. What protocols do this
    running libcurl support? What is the version number of the running libcurl
    (returning the well-defined version-#define). This could possibly be made
    by allowing curl_easy_getinfo() work with a NULL pointer for global info,
    but perhaps better would be to introduce a new curl_getinfo() (or similar)
    function for global info reading.

  * Add asynchronous name resolving (http://daniel.haxx.se/resolver/). This
    should be made to work on most of the supported platforms, or otherwise it
    isn't really interesting.

  * Data sharing. Tell which easy handles within a multi handle that should
    share cookies, connection cache, dns cache, ssl session cache.  Full
    suggestion found here: http://curl.haxx.se/dev/sharing.txt

  * Mutexes. By adding mutex callback support, the 'data sharing' mentioned
    above can be made between several easy handles running in different threads
    too. The actual mutex implementations will be left for the application to
    implement, libcurl will merely call 'getmutex' and 'leavemutex' callbacks.
    Part of the sharing suggestion at: http://curl.haxx.se/dev/sharing.txt

  * Set the SO_KEEPALIVE socket option to make libcurl notice and disconnect
    very long time idle connections.

  * Go through the code and verify that libcurl deals with big files >2GB and
    >4GB all over. Bug reports (and source reviews) indicate that it doesn't
    currently work properly.

  * Make the built-in progress meter use its own dedicated output stream, and
    make it possible to set it. Use stderr by default.

  * CURLOPT_MAXFILESIZE. Prevent downloads that are larger than the specified
    size. CURLE_FILESIZE_EXCEEDED would then be returned. Gautam Mani
    requested. That is, the download should even begin but be aborted
    immediately.

  * Allow the http_proxy (and other) environment variables to contain user and
    password as well in the style: http://proxyuser:proxypasswd@proxy:port
    Berend Reitsma suggested.

  LIBCURL - multi interface

  * Make sure we don't ever loop because of non-blocking sockets return
    EWOULDBLOCK or similar. This concerns the HTTP request sending (and
    especially regular HTTP POST), the FTP command sending etc.

  * Make uploads treated better. We need a way to tell libcurl we have data to
    write, as the current system expects us to upload data each time the socket
    is writable and there is no way to say that we want to upload data soon
    just not right now, without that aborting the upload.

  DOCUMENTATION

  * More and better

  FTP

  * FTP ASCII upload does not follow RFC959 section 3.1.1.1: "The sender
    converts the data from an internal character representation to the standard
    8-bit NVT-ASCII representation (see the Telnet specification).  The
    receiver will convert the data from the standard form to his own internal
    form."

  * An option to only download remote FTP files if they're newer than the local
    one is a good idea, and it would fit right into the same syntax as the
    already working http dito works. It of course requires that 'MDTM' works,
    and it isn't a standard FTP command.

  * Add FTPS support with SSL for the data connection too.  This should be made
    according to the specs written in draft-murray-auth-ftp-ssl-08.txt,
    "Securing FTP with TLS"

  HTTP

  * Pass a list of host name to libcurl to which we allow the user name and
    password to get sent to. Currently, it only get sent to the host name that
    the first URL uses (to prevent others from being able to read it), but this
    also prevents the authentication info from getting sent when following
    locations to legitimate other host names.

  * "Content-Encoding: compress/gzip/zlib" HTTP 1.1 clearly defines how to get
    and decode compressed documents. There is the zlib that is pretty good at
    decompressing stuff. This work was started in October 1999 but halted again
    since it proved more work than we thought. It is still a good idea to
    implement though. This requires the filter system mentioned above.

  * Authentication: NTLM. Support for that MS crap called NTLM
    authentication. MS proxies and servers sometime require that. Since that
    protocol is a proprietary one, it involves reverse engineering and network
    sniffing. This should however be a library-based functionality. There are a
    few different efforts "out there" to make open source HTTP clients support
    this and it should be possible to take advantage of other people's hard
    work. http://modntlm.sourceforge.net/ is one. There's a web page at
    http://www.innovation.ch/java/ntlm.html that contains detailed reverse-
    engineered info.

  * RFC2617 compliance, "Digest Access Authentication" A valid test page seem
    to exist at: http://hopf.math.nwu.edu/testpage/digest/ And some friendly
    person's server source code is available at
    http://hopf.math.nwu.edu/digestauth/index.html Then there's the Apache
    mod_digest source code too of course.  It seems as if Netscape doesn't
    support this, and not many servers do. Although this is a lot better
    authentication method than the more common "Basic". Basic sends the
    password in cleartext over the network, this "Digest" method uses a
    challange-response protocol which increases security quite a lot.

  * Pipelining. Sending multiple requests before the previous one(s) are done.
    This could possibly be implemented using the multi interface to queue
    requests and the response data.

  TELNET

  * Make TELNET work on windows98!

  * Reading input (to send to the remote server) on stdin is a crappy solution
    for library purposes. We need to invent a good way for the application to
    be able to provide the data to send.

  * Move the telnet support's network select() loop go away and merge the code
    into the main transfer loop. Until this is done, the multi interface won't
    work for telnet.

  SSL

  * If you really want to improve the SSL situation, you should probably have a
    look at SSL cafile loading as well - quick traces look to me like these are
    done on every request as well, when they should only be necessary once per
    ssl context (or once per handle). Even better would be to support the SSL
    CAdir option - instead of loading all of the root CA certs for every
    request, this option allows you to only read the CA chain that is actually
    required (into the cache)...

  * Add an interface to libcurl that enables "session IDs" to get
    exported/imported. Cris Bailiff said: "OpenSSL has functions which can
    serialise the current SSL state to a buffer of your choice, and
    recover/reset the state from such a buffer at a later date - this is used
    by mod_ssl for apache to implement and SSL session ID cache". This whole
    idea might become moot if we enable the 'data sharing' as mentioned in the
    LIBCURL label above.

  * OpenSSL supports a callback for customised verification of the peer
    certificate, but this doesn't seem to be exposed in the libcurl APIs. Could
    it be? There's so much that could be done if it were! (brought by Chris
    Clark)

  * Make curl's SSL layer option capable of using other free SSL libraries.
    Such as the Mozilla Security Services
    (http://www.mozilla.org/projects/security/pki/nss/) and GNUTLS
    (http://gnutls.hellug.gr/)

  LDAP

  * Look over the implementation. The looping will have to "go away" from the
    lib/ldap.c source file and get moved to the main network code so that the
    multi interface and friends will work for LDAP as well.

  CLIENT

  * Add an option that prevents cURL from overwiting existing local files. When
    used, and there already is an existing file with the target file name
    (either -O or -o), a number should be appended (and increased if already
    existing). So that index.html becomes first index.html.1 and then
    index.html.2 etc. Jeff Pohlmeyer suggested.

  * "curl ftp://site.com/*.txt"

  * Several URLs can be specified to get downloaded. We should be able to use
    the same syntax to specify several files to get uploaded (using the same
    persistant connection), using -T.

  * When the multi interface has been implemented and proved to work, the
    client could be told to use maximum N simultaneous transfers and then just
    make sure that happens. It should of course not make more than one
    connection to the same remote host.

  * Extending the capabilities of the multipart formposting. How about leaving
    the ';type=foo' syntax as it is and adding an extra tag (headers) which
    works like this: curl -F "coolfiles=@fil1.txt;headers=@fil1.hdr" where
    fil1.hdr contains extra headers like

      Content-Type: text/plain; charset=KOI8-R"
      Content-Transfer-Encoding: base64
      X-User-Comment: Please don't use browser specific HTML code

    which should overwrite the program reasonable defaults (plain/text,
    8bit...) (Idea brough to us by kromJx)

  TEST SUITE

  * Extend the test suite to include more protocols. The telnet could just do
    ftp or http operations (for which we have test servers).

  * Make the test suite work on more platforms. OpenBSD and Mac OS. Remove
    fork()s and it should become even more portable.

  * Introduce a test suite that tests libcurl better and more explicitly.
	_ _ ____ _
	___\| \| \| \| _ \\| \|
	/ __\| \| \| \| \|_) \| \|
	\| (__\| \|_\| \| _ <\| \|___
	\___\|\___/\|_\| \_\_____\|

	TODO

	Things to do in project cURL. Please tell us what you think, contribute and
	send us patches that improve things! Also check the http://curl.haxx.se/dev
	web section for various technical development notes.

	LIBCURL

	* Introduce an interface to libcurl that allows applications to easier get to
	know what cookies that are received. Pushing interface that calls a
	callback on each received cookie? Querying interface that asks about
	existing cookies? We probably need both.

	* Make content encoding/decoding internally be made using a filter system.

	* Introduce another callback interface for upload/download that makes one
	less copy of data and thus a faster operation.
	[http://curl.haxx.se/dev/no_copy_callbacks.txt]

	* Run-time querying about library characterics. What protocols do this
	running libcurl support? What is the version number of the running libcurl
	(returning the well-defined version-#define). This could possibly be made
	by allowing curl_easy_getinfo() work with a NULL pointer for global info,
	but perhaps better would be to introduce a new curl_getinfo() (or similar)
	function for global info reading.

	* Add asynchronous name resolving (http://daniel.haxx.se/resolver/). This
	should be made to work on most of the supported platforms, or otherwise it
	isn't really interesting.

	* Data sharing. Tell which easy handles within a multi handle that should
	share cookies, connection cache, dns cache, ssl session cache. Full
	suggestion found here: http://curl.haxx.se/dev/sharing.txt

	* Mutexes. By adding mutex callback support, the 'data sharing' mentioned
	above can be made between several easy handles running in different threads
	too. The actual mutex implementations will be left for the application to
	implement, libcurl will merely call 'getmutex' and 'leavemutex' callbacks.
	Part of the sharing suggestion at: http://curl.haxx.se/dev/sharing.txt

	* Set the SO_KEEPALIVE socket option to make libcurl notice and disconnect
	very long time idle connections.

	* Go through the code and verify that libcurl deals with big files >2GB and
	>4GB all over. Bug reports (and source reviews) indicate that it doesn't
	currently work properly.

	* Make the built-in progress meter use its own dedicated output stream, and
	make it possible to set it. Use stderr by default.

	* CURLOPT_MAXFILESIZE. Prevent downloads that are larger than the specified
	size. CURLE_FILESIZE_EXCEEDED would then be returned. Gautam Mani
	requested. That is, the download should even begin but be aborted
	immediately.

	* Allow the http_proxy (and other) environment variables to contain user and
	password as well in the style: http://proxyuser:proxypasswd@proxy:port
	Berend Reitsma suggested.

	LIBCURL - multi interface

	* Make sure we don't ever loop because of non-blocking sockets return
	EWOULDBLOCK or similar. This concerns the HTTP request sending (and
	especially regular HTTP POST), the FTP command sending etc.

	* Make uploads treated better. We need a way to tell libcurl we have data to
	write, as the current system expects us to upload data each time the socket
	is writable and there is no way to say that we want to upload data soon
	just not right now, without that aborting the upload.

	DOCUMENTATION

	* More and better

	FTP

	* FTP ASCII upload does not follow RFC959 section 3.1.1.1: "The sender
	converts the data from an internal character representation to the standard
	8-bit NVT-ASCII representation (see the Telnet specification). The
	receiver will convert the data from the standard form to his own internal
	form."

	* An option to only download remote FTP files if they're newer than the local
	one is a good idea, and it would fit right into the same syntax as the
	already working http dito works. It of course requires that 'MDTM' works,
	and it isn't a standard FTP command.

	* Add FTPS support with SSL for the data connection too. This should be made
	according to the specs written in draft-murray-auth-ftp-ssl-08.txt,
	"Securing FTP with TLS"

	HTTP

	* Pass a list of host name to libcurl to which we allow the user name and
	password to get sent to. Currently, it only get sent to the host name that
	the first URL uses (to prevent others from being able to read it), but this
	also prevents the authentication info from getting sent when following
	locations to legitimate other host names.

	* "Content-Encoding: compress/gzip/zlib" HTTP 1.1 clearly defines how to get
	and decode compressed documents. There is the zlib that is pretty good at
	decompressing stuff. This work was started in October 1999 but halted again
	since it proved more work than we thought. It is still a good idea to
	implement though. This requires the filter system mentioned above.

	* Authentication: NTLM. Support for that MS crap called NTLM
	authentication. MS proxies and servers sometime require that. Since that
	protocol is a proprietary one, it involves reverse engineering and network
	sniffing. This should however be a library-based functionality. There are a
	few different efforts "out there" to make open source HTTP clients support
	this and it should be possible to take advantage of other people's hard
	work. http://modntlm.sourceforge.net/ is one. There's a web page at
	http://www.innovation.ch/java/ntlm.html that contains detailed reverse-
	engineered info.

	* RFC2617 compliance, "Digest Access Authentication" A valid test page seem
	to exist at: http://hopf.math.nwu.edu/testpage/digest/ And some friendly
	person's server source code is available at
	http://hopf.math.nwu.edu/digestauth/index.html Then there's the Apache
	mod_digest source code too of course. It seems as if Netscape doesn't
	support this, and not many servers do. Although this is a lot better
	authentication method than the more common "Basic". Basic sends the
	password in cleartext over the network, this "Digest" method uses a
	challange-response protocol which increases security quite a lot.

	* Pipelining. Sending multiple requests before the previous one(s) are done.
	This could possibly be implemented using the multi interface to queue
	requests and the response data.

	TELNET

	* Make TELNET work on windows98!

	* Reading input (to send to the remote server) on stdin is a crappy solution
	for library purposes. We need to invent a good way for the application to
	be able to provide the data to send.

	* Move the telnet support's network select() loop go away and merge the code
	into the main transfer loop. Until this is done, the multi interface won't
	work for telnet.

	SSL

	* If you really want to improve the SSL situation, you should probably have a
	look at SSL cafile loading as well - quick traces look to me like these are
	done on every request as well, when they should only be necessary once per
	ssl context (or once per handle). Even better would be to support the SSL
	CAdir option - instead of loading all of the root CA certs for every
	request, this option allows you to only read the CA chain that is actually
	required (into the cache)...

	* Add an interface to libcurl that enables "session IDs" to get
	exported/imported. Cris Bailiff said: "OpenSSL has functions which can
	serialise the current SSL state to a buffer of your choice, and
	recover/reset the state from such a buffer at a later date - this is used
	by mod_ssl for apache to implement and SSL session ID cache". This whole
	idea might become moot if we enable the 'data sharing' as mentioned in the
	LIBCURL label above.

	* OpenSSL supports a callback for customised verification of the peer
	certificate, but this doesn't seem to be exposed in the libcurl APIs. Could
	it be? There's so much that could be done if it were! (brought by Chris
	Clark)

	* Make curl's SSL layer option capable of using other free SSL libraries.
	Such as the Mozilla Security Services
	(http://www.mozilla.org/projects/security/pki/nss/) and GNUTLS
	(http://gnutls.hellug.gr/)

	LDAP

	* Look over the implementation. The looping will have to "go away" from the
	lib/ldap.c source file and get moved to the main network code so that the
	multi interface and friends will work for LDAP as well.

	CLIENT

	* Add an option that prevents cURL from overwiting existing local files. When
	used, and there already is an existing file with the target file name
	(either -O or -o), a number should be appended (and increased if already
	existing). So that index.html becomes first index.html.1 and then
	index.html.2 etc. Jeff Pohlmeyer suggested.

	* "curl ftp://site.com/*.txt"

	* Several URLs can be specified to get downloaded. We should be able to use
	the same syntax to specify several files to get uploaded (using the same
	persistant connection), using -T.

	* When the multi interface has been implemented and proved to work, the
	client could be told to use maximum N simultaneous transfers and then just
	make sure that happens. It should of course not make more than one
	connection to the same remote host.

	* Extending the capabilities of the multipart formposting. How about leaving
	the ';type=foo' syntax as it is and adding an extra tag (headers) which
	works like this: curl -F "coolfiles=@fil1.txt;headers=@fil1.hdr" where
	fil1.hdr contains extra headers like

	Content-Type: text/plain; charset=KOI8-R"
	Content-Transfer-Encoding: base64
	X-User-Comment: Please don't use browser specific HTML code

	which should overwrite the program reasonable defaults (plain/text,
	8bit...) (Idea brough to us by kromJx)

	TEST SUITE

	* Extend the test suite to include more protocols. The telnet could just do
	ftp or http operations (for which we have test servers).

	* Make the test suite work on more platforms. OpenBSD and Mac OS. Remove
	fork()s and it should become even more portable.

	* Introduce a test suite that tests libcurl better and more explicitly.