Re: [Discuss-gnuradio] USB Issues

From: Eric Blossom
Subject: Re: [Discuss-gnuradio] USB Issues
Date: Sat, 14 Jan 2006 15:10:50 -0800
User-agent: Mutt/1.5.6i

On Sat, Jan 14, 2006 at 01:55:50PM -0500, Michael Dickens wrote:
> [I sent a similar message out just under a week ago and have
> received no replies ... so I'm sending this updated message in hopes
> that someone will share their thoughts on how to proceed. - MLD]

Hi Michael, sorry for the delay getting back to you.  Between an ISP
outage and a cold, I'm a bit behind.

> I've tried all the USB / general coding speed tweaks that I can think  
> of and easily find / work out, all to no avail.  I'm getting about 28  
> MBytes/s on my Mac at home (Dual G4 @ 1.25 GHz, 1.75 GB DRAM @ 167  
> MHz, cheapo PCI USB 2.0 card); and 17-19 MBps on our lab Mac (Dual G5  
> @ 2 GHz, 1 GB DRAM @ 1 GHz, with built-in USB 2.0).

The cheapo PCI USB card could be killing your throughput.  Which
chipset does it use?  Regarding the dual G5, what's the best
throughput *anyone* has achieved by any means?

> I believe that the primary MacOS throughput issue is an added Apple- 
> provided application-layer library/API called "CoreFoundation" which  
> provides "ease of use" to the application writer.  The next layer of
> USB I/O transport is Darwin's "ports", which are part of the mach  
> kernel but can be somewhat accessed by the application layer.

Generally speaking, reliable throughput on the USRP is dominated by
the OS's ability to deliver USB packets with small interpacket gaps.
The EHCI controller does scatter-gather DMA, with a fancy hardware
queuing mechanism.  The hardware, if properly implemented, should be
able to drive the USB at full speed.

Under Linux, we've been able to achieve maximum throughput by ensuring
that the EHCI endpoint queue is never empty.  If this is the case,
then the hardware should "just work", and the OS is pretty much out of
the loop.  We keep the endpoint queue non-empty by submitting multiple
asynchronous requests.  These end up hung off the hardware DMA queue.

> Going to the "ports" directly would definitely speed throughput up,
> but Apple's documentation doesn't give enough info on how to write
> the programs, and I can find no examples of such programming.  Most
> of what I can find is via Apple's Darwin Source Code repository
> http://darwinsource.opendarwin.org, reading through the source
> code to figure out how to do things.  I really shouldn't have to go
> to this level to "just" get async USB to work!

Before heading down the kernel hacking path, I strongly suggest
measuring the actual worst case inter-USB packet delay across the USB
while doing reads and writes.  This will provide insight into whether
the hardware itself is slow, or if there's a driver problem.  The
driver problem will show up as highly variable gaps between packets.

This measurement is pretty easy to make on the USRP with a logic
analyzer.  The signals of interest are the GPIF control signals RD and
WR between the FX2 and FPGA.  These are asserted when a packet is
burst between the FX2 and the FPGA.

// FPGA input lines that are tied to the FX2 CTLx outputs.
// These are controlled by the GPIF microprogram...
// WR                                   bmBIT0  // usbctl[0]
// RD                                   bmBIT1  // usbctl[1]
// OE                                   bmBIT2  // usbctl[2]

That is:

  WR is FX2 CTL0 (netname: GPIF_CTL0)
  RD is FX2 CTL1 (netname: GPIF_CTL1)

If you've got fancy probes on your analyzer, you may be able to pick
these off the FX2 directly.  If not, then the easiest thing to do is
to use a basic tx or rx daughterboard, and hack the verilog such that
the signals you are interested in are connected to a daughterboard i/o
pin (see below).  Then connect the LA to the daughterboard i/o pin
headers.  You'll also need to enable the outputs using
u._write_oe(...).  Be careful ;)

See usrp_std.v, look at the instantiation of the master_control module.
Find the signals named debug_{0,4} and change them as needed.  Note
that the rx_buffer.v and tx_buffer.v modules define debug busses that
contain the RD and WR signals. [The standard fpga build wires the
rx_debugbus to the TX_A i/o pins.  RD will be output on i/o pin 0.
You'll just need to set bit 0 in the FR_DEBUG_EN register (see below).]

The values of the debug_{0,4} signals are sent to the daughterboard
pins instead of the "normal values" if the appropriate bit is set in the
FR_DEBUG_EN FPGA register.

// If the corresponding bit is set, internal FPGA debug circuitry
// controls the i/o pins for the associated bank of daughterboard
// i/o pins.  Typically used for debugging FPGA designs.

#define FR_DEBUG_EN             14
#  define bmFR_DEBUG_EN_TX_A           (1 << 0)        // debug controls TX_A 
#  define bmFR_DEBUG_EN_RX_A           (1 << 1)        // debug controls RX_A 
#  define bmFR_DEBUG_EN_TX_B           (1 << 2)        // debug controls TX_B 
#  define bmFR_DEBUG_EN_RX_B           (1 << 3)        // debug controls RX_B 

> If I can't use "ports" directly, then I can do that via a kernel  
> extension (KEXT), or "device driver" in older terms.  

> The primary issue blocking me investigating this option is that GR's
> use of LIBUSB would need to be changed so that all USB calls go
> through the FUSB code (which in turn could call LIBUSB or not as
> needed).  Various initialization routines (e.g. usb_init(),
> usb_find_busses () ) would be moved to the KEXT, requiring new code
> in FUSB to access those KEXT routines.  Right now those routines are
> "hard coded" to use LIBUSB (most are in usrp/host/lib/usrp_prims.cc).

> Thus, IMHO what needs to happen is to route -all- USB calls through  
> FUSB, so that there is 1 place to deal with "everything USB".   

> Thoughts? - MLD

This refactoring would be OK by me.
Please go ahead and do it and mail patch(es) to
address@hidden with the changes.

