ci_udc only allocates a single QTD structure per EP. All data needs to be
extracted from the DTD prior to calling ci_ep_submit_next_request(), since
that fills the QTD with next transaction's parameters. Fix
handle_ep_complete() to extract the transaction (remaining) length before
kicking off the next transaction.
In practice, this only causes writes to UMS devices to fail for me. I may
have tested the final versions of my previous ci_udc patch only with
reads. More recently, I had patches applied locally that allocated a QTD
per USB request rather than per USB EP, although since that doesn't give
any performance benefit, I'm dropping those.
Fixes: 2813006fec ("usb: ci_udc: allow multiple buffer allocs per ep")
Signed-off-by: Stephen Warren <swarren@nvidia.com>