|
|
|
Here's a form to compare how GET and POST work. Place in the same directory as the expanded test case attached above.
<html> <body> <h3>GET</h3> <form name="input" action="echo.jsp?query=string" method="GET"> get_param: <input type="text" name="get_param" value="get_value"/> <input type="submit" value="Submit"/> </form> <h3>POST</h3> <form name="input" action="echo.jsp?query=string" method="POST"> Username: <input type="text" name="post_param" value="post_value"/> <input type="submit" value="Submit"/> </form> </body> </html> Our APIs are mostly right. I think our POST handling doesn't conform to the way POST actions are sent by HTML forms (run the example above). POST requests that have query parameters defined in the URL aren't removed from the query string and placed in the body like we do it. They remain in the query string.
This is how it should work in OpenLaszlo if you define a query string in the src like "http://somehost.com/?queryParam=queryValue". How about introducing a new attribute called keepquerystring. If set to true, the query string remains in the src line. Also if the query string is set via the querystring attribute or setQueryString API. By default, this is false and works the way it does currently. What do you think? Should we send out a proposal to laszlo-dev? Maybe this is just an out-and-out bug because Henry and I did not understand that HTML POST can have a query string too? If that is the case, our scheme to make setQueryString and setQueryParam two API's to the same database is wrong and just needs to be fixed. No new API is needed.
Can you point to some HTML spec that says this is how things are supposed to be? Also/alternatively, what do you _really_ need here? Would it be sufficient to just say that in a POST the src URL is untrammeled, and that if it has query params they are passed on through (rather than stripped and/or coerced into the post body as apparently happens now)? I think that would be the simplest fix and the least likely to break anything. I believe this is a bug, but changing the behavior of setQueryString will break backwards compatibility.
I like the idea: "just say that in a POST the src URL is untrammeled, and that if it has query params they are passed on through" Pablo: will that work? As far as I can tell, there is nothing in the HTTP spec that prevents someone from sending a query string in the request path. For example:
POST /login.jsp?qsParam=qsValue HTTP/1.1 Host: www.mysite.com User-Agent: Mozilla/4.0 Content-Length: 27 Content-Type: application/x-www-form-urlencoded userid=joe&password=guessme The content is a POST x-www-form-urlencoded body and "?qsParam=qsValue" is the query string of the URI. The Java API description for HttpServletRequest.getQueryString(): Returns the query string that is contained in the request URL after the path. Query parameters can be retrieved in the query string or in the content body (if POST and the content type is x-www-form-urlencoded). To answer your second question: yes, it would be sufficient to leave the src of the POST alone. Query parameters that should appear in the body should be set via the setQueryParam API. Query strings should always appear on the URL. [Sarah: _I_ really like that I can use 'trammel' in a bug report. :)]
I think we have to be careful about falling into the trap of thinking that HTTP is the only 'transport', now that we have data providers. Hence my suggestion that we just preserve the URL and not change the param API's. There may be other transports that do _not_ have 2 channels for sending parameters. So if Pablo is happy with an untrammeled src URL, I think that is the way we should go. Pablo confirms that even this suggestion will break backwards-compatibility.
David Temkin and I brainstormed another approach. We could use the dataprovider API to create an alternate HTTP dataprovider which has the APIs done the "right" way, where setQueryParam goes in the POST body and setQueryString goes on the URL as you would expect. Then the default dataset dataprovider would be the old one for now, but people could use the new one if they need this feature. At some future time, when we are comfortable breaking backward-compatibility a bit, we could change the default. Did we ever add that thing we talked about where the default dataprovider for datasets could be set globally? I have to ask: will my suggestion break an existing program, or just _theoretically_ break compatibility? These API's have been a never-ending source of bugs because I don't think anyone really knows what they were supposed to do. Henry and I only recently 'fixed' them so that setQueryString, setQueryParam, and setSrc (which may include a query component if an http: URL) operate on the same 'database'. Prior to that, mixing these calls gave pretty much random results.
OTOH, since this 'two-channel' query is really only a property of HTTP transport, I like the idea that it would be moved out of the generic data path and into the specific data provider. Makes me think that the GET/POST distinction belongs on the dataprovider too. I don't know the answer to the default dataprovider question. Henry? It will break compatibility for anyone using POST and those APIs. I suppose that is "theoretical" -- it would have broken Laszlo Mail, I believe, not sure about Webtop, which, of course, we're changing anyhow.
Yes, there is a global declared for the default data provider, it's set to an instance of LzHTTPDataProvider right now
var defaultdataprovider = httpdataprovider; LzDataset uses that global as the default dataprovider class LzDataset .. { var dataprovider = defaultdataprovider; } Pablo's suggestion for a flag to change the behavior, "keepQueryString" , might be the most pragmatic thing to do.
We could eventually deprecate the current behavior, change to the new behavior, and leave a flag "doItTheOldWay" or something ? The methods that need to be modified to preserver the query string are mostly in LzDataset, rather than the dataprovider. I don't think we need to modify the dataprovider, because it is passed the query string
and the query params as separate args now, so it is mostly up to the Dataset as to what to put in those two parameters. Leaving the URL untrammeled could only break an application that puts at least part of its query in an URL and expects that by using POST the URL part of the query will be removed and merged into the body of the POST. Wacky, but as I said, I don't think anyone ever defined how the three ways of specifying query parameters were to interact. Personally, I would be very happy if we could eliminate at least 1 of the 3 API's that are used to adjust query parameters.
I need to ask: Do we want there to always be _TWO_ channels of query in a data request? Is that the API we want? And all data providers have to support two channels? How would we support 2 channels using HTTP/GET as transport? Or is the bug really, what Pablo first described, that he just wants a way to uniqify the URL used to make requests for some extra 'out-of-band' data (which is used for load balancing or logging) for the HTTP data provider only? If the latter, there is already yet another channel available to HTTP that (hopefully) is already passed through untrammeled, and that is the anchor, or fragment (delimited by `#`). Maybe all Pablo needs to do is put his out-of-band info there? [Given the short timeframe, I am trying to come up with the solution that is going to perturb things the least.] The point is not the number of channels, but rather that we need to be able to supply the whole URL for any URL-based transport. I don't understand your comment about the anchor -- is there a separate API that appends the anchor part of the URL?
An URL has a number of components. The first component is the protocol (what comes before the ':'). What comes after that is dependent on the protocol. For HTTP, there is a host (after the '//' before the '/') which may include a port, login and password, then a path (between the first and last '/'), then a file (between the last '/' and any '?' or '#' or end), then an optional query (between '?' and any '#' or end) and finally an anchor or fragment (between '#' and the end).
For other protocols, everything after the '//' is defined by the protocol, so we can't depend on there even being any such thing as a query in other protocols. My point is that to bring some sanity to the 3 API's that we have for setting the parameters of a query, Henry and I declared that setSrc (when the protocol is HTTP and there is a query part to the URL), setQueryString, and setQueryParam, are just 3 API's for setting pairs of key/value that eventually get sent via the underlying transport when a request is made. In the case of GET, they are all encoded into the query part of the URL. In the case of POST, they all get encoded into the post body. This isn't the way it used to be, but we had a lot more bugs where people were surprised that this isn't what happened than people who expected the 3 API's to be separate channels of communication. I'm still not clear about 'we need to supply the whole URL' because in Pablo's initial comment it seems that he really just needs to encode an additional marker in the URL (which is why I am talking about data channels and out-of-band data). The query API's should not touch the fragment part of an HTTP URL, so that is why I suggest that perhaps Pablo could just use that, avoiding the issue of the query parameters altogether. I'm also happy to accept that setSrc doesn't interact with the other parameter API's. [In the case of HTTP/GET, there will have to be some sort of interaction, perhaps just that the parameters are appended to any query in the URL.] And I have to say again, given how ill-defined these API's were in 3.x, I don't see how anything we choose to do at this point could be deemed incompatible.
Here's my proposal: We change setSrc so that it _does_not_ modify the URL, and does not parse any query string. We change the implementation of query parameters so that for HTTP/GET they are appended to any query already in the src URL and for HTTP/POST they are sent in the body (but any query in the src URL is left as is). We update the documentation to note that setSrc does _not_ affect the query parameters, and the query Parameter API's do _not_ affect the src URL. (It is an implementation detail that for HTTP/GET the query parameters are merged into the src URL by appending to any src URL query.) Finally setQueryParam and setQueryString remain two API's for managing the query parameters, but setQueryString should be deprecated because it's name would seem to imply that it will overwrite the query part of an HTTP URL, but it does not. As a side note, my proposed 'anchor' work-around (where you could set the anchor portion of the request URL to get a similar effect) will apparently not work. Inspecting the HTTPDataProvider code, the request URL is being treated as a string and the query blindly appended to that string, which means you will get a royal mess if you try to use an anchor in the source URL. The URL really should be kept as an LzURL until it is ready to be sent, it should not be manipulated as a string.
Really for 4.0.9 (Sno-cone), but we have no such project (yet)
------------------------------------------------------------------------
r7872 | hqm | 2008-01-23 13:50:26 -0500 (Wed, 23 Jan 2008) | 53 lines Changed paths: M /openlaszlo/branches/wafflecone/WEB-INF/lps/lfc/data/LzDataset.lzs M /openlaszlo/branches/wafflecone/WEB-INF/lps/lfc/data/LzHTTPDataProvider.lzs M /openlaszlo/branches/wafflecone/WEB-INF/lps/lfc/data/LzParam.lzs M /openlaszlo/branches/wafflecone/WEB-INF/lps/lfc/kernel/dhtml/LzHTTPLoader.js M /openlaszlo/branches/wafflecone/WEB-INF/lps/lfc/kernel/swf/LzHTTPLoader.as M /openlaszlo/branches/wafflecone/WEB-INF/lps/lfc/kernel/swf/LzLoadQueue.as M /openlaszlo/branches/wafflecone/WEB-INF/lps/server/src/org/openlaszlo/data/HTTPDataSource.java M /openlaszlo/branches/wafflecone/test/lfc/data/alldata.lzx M /openlaszlo/branches/wafflecone/test/lfc/data/testsetheaders-solo.lzx M /openlaszlo/branches/wafflecone/test/lfc/data/testsetheaders.lzx Change 20080123-hqm-4 by hqm@DADDY_THNKPAD67 on 2008-01-23 09:36:16 EST in /cygdrive/c/users/hqm/openlaszlo/wafflecone for http://svn.openlaszlo.org/openlaszlo/branches/wafflecone Summary: updated: preserve url query string with post data requests New Features: Bugs Fixed: Technical Reviewer: pkang QA Reviewer: ptw Doc Reviewer: Documentation: The URL which is specified as the 'src' attribute of a dataset, such as <dataset src="http://foo.bar.com/baz.php?myarg=1" /> will be preserved in POST operations, both SOLO and proxied. All data which is set via the setQueryString setQueryParam setQueryParams will be stored in a distinct table, and will only be merged with the src url in the case of a GET request. For POST requests, the src full URL with its original query string will be sent, and all other param data will be sent www-form-encoded in the POST body. Note: raw post data via the "lzpostbody" arg turns out not to work in SOLO mode, and has never actually fully worked. Release Notes: Details: Tests: test/lfc/data/alldata.lzx DHTML/SWF amazon calendar proxied/solo DHTML/SWF ------------------------------------------------------------------------ Open again, needs to be set to resolved, fixed.
Test files that were originally contained in queryparam.tgz. They should all go in the same diretory. Please name them as indicated.
==== echo.jsp ==== <%@ page import="java.util.*" %> <% String queryString = request.getQueryString(); if (queryString == null) queryString = ""; response.setContentType("text/xml"); out.println("<request>"); out.println("<method>" + request.getMethod() + "</method>"); out.println("<queryString>" + queryString + "</queryString>"); out.println("<parameters>"); Map map = request.getParameterMap(); Iterator iter = map.keySet().iterator(); while (iter.hasNext()) { String key = (String)iter.next(); String[] values = (String[])map.get(key); out.println("<parameter>" + key + "=" + values[0] + "</parameter>"); } out.println("</parameters>"); out.println("</request>"); %> ==== echo.html ==== <html> <body> <h3>GET</h3> <form name="input" action="echo.jsp?query=string" method="GET"> get_param: <input type="text" name="get_param" value="get_value"/> <input type="submit" value="Submit"/> </form> <h3>POST</h3> <form name="input" action="echo.jsp?query=string" method="POST"> Username: <input type="text" name="post_param" value="post_value"/> <input type="submit" value="Submit"/> </form> </body> </html> ==== main.lzx ==== <canvas debug="true" proxied="false"> <debug x="20" y="20" width="700" height="430"/> <!-- How can I get query params on a POST request to stay on the querystring and not the POST body? --> <dataset name="postDset" querytype="POST" src="http:echo.jsp" ondata="canvas.display(this)"> <handler name="oninit"> this.setQueryString("xxx=yyy"); this.doRequest(); </handler> </dataset> <dataset name="getDset" querytype="GET" src="http:echo.jsp?foo=bar" autorequest="true" ondata="canvas.display(this)"/> <method name="display" args="dset"><![CDATA[ var data = dset.data; var method = data.getElementsByTagName("method")[0].childNodes[0].data var queryString = data.getElementsByTagName("queryString")[0].childNodes if (queryString) queryString = queryString[0].data; var params = data.getElementsByTagName("parameters")[0].childNodes; Debug.write("==== " + dset.name + " ===="); Debug.write("[ request ]"); Debug.write("src:", dset.src); Debug.write("querytype:", dset.querytype); Debug.write(""); Debug.write("[ response ]"); Debug.write("method:", method); Debug.write("queryString:", queryString); Debug.write("parameters:"); for (var i=0; i < params.length; i++) { var p = params[i].childNodes[0].data; Debug.write(i + ": " + p); } Debug.write(""); Debug.write(""); ]]> </method> </canvas> Henry, may you take a look at this thread (http://forum.openlaszlo.org/showthread.php?t=11205), especially the last two inputs.
Users report POST is broken in their apps when they compile to 4.0.9. A further complication has arisen, I am not sure if I should file a separate bug on this.
There turns out to be a behavior in the Flash player (swf7,8, and 9 actually) where if you ask it to do a POST request, but try to post with an empty data object (LoadVars in swf8, or URLLoader/URLRequest in swf9) , then the Flash player actually makes a GET request. When I made the changes for preserving the URL query string, that had the side effect of moving the "__lzbc__" 'cache-breaking' timestamp query arg from the body of POST requests into the URL query string, which caused requests with no data parameters to get turned into GET requests by the Flash player. So, I am going to add a patch which, for POST requests, always puts the cache-breaking timestamp into the POST data. ------------------------------------------------------------------------
r8100 | hqm | 2008-02-25 15:17:25 -0500 (Mon, 25 Feb 2008) | 54 lines Changed paths: M /openlaszlo/trunk/WEB-INF/lps/lfc/data/LzHTTPDataProvider.lzs Change 20080225-hqm-q by hqm@badtzmaru.local on 2008-02-25 12:53:53 EST in /Users/hqm/openlaszlo/trunk/WEB-INF/lps/lfc for http://svn.openlaszlo.org/openlaszlo/trunk/WEB-INF/lps/lfc Summary: add data to SWF SOLO POST requests, to ensure that Flash does not turn them into GET requests New Features: Bugs Fixed: Technical Reviewer: andre QA Reviewer: max Doc Reviewer: (pending) Documentation: Release Notes: Details: SWF has a bug whereby an empty POST request (one with no data in it's post body) is turned into a GET request. This behavior used to be masked by the way we forced all query args into the POST body for post requests. When a change was made to separate and preserve the query string from the post body, people started reporting bugs where their POST requests became GET requests. This patch forces the "__lzbc__" cache-breaking timestamp arg into the body of SOLO post requests. I am wondering if I should also force this value into Proxied requests, to ensure that the data which is sent to the back-end is consistent between SOLO and proxied requests. This patch only forces the __lzbc__ arg into SOLO POST requests. Tests: test.lzx: <canvas debug="true" height="400" > <dataset name="dsGroups" request="true" type="http" src="test.jsp" querytype="POST"/> <datapointer name="dpGroups" xpath="dsGroups:/Response"> <handler name="ondata" args="d"><![CDATA[ Debug.write(d); ]]></handler> </datapointer> </canvas> test.jsp: <%System.out.println(request.getMethod());%> <Response> <Method><%=request.getMethod()%></Method> </Response> | |||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
Echo.jsp returns the query string and query param it received from the LZX file.