Complete analysis and design section of writeup

Aadi Desai 2023-06-21 11:55:11 +01:00
parent 7a7d96abf3
commit 70dda89602
Signed by: supleed2
SSH key fingerprint: SHA256:CkbNRs0yVzXEiUp2zd0PSxsfRUMFF9bLlKXtE1xEbKM


@ -1,7 +1,6 @@
Write-up of `FPGA Accelerator for StackSynth`
- TODOs
- Analysis and Design Section
- Reduce use of backticks
- Move large listings to appendix
- Measure SNR
@ -229,7 +228,7 @@ The key components of LiteX used in this project are:
- `ClockDomain`: creates a new clock domain, used for the DAC system clock, driven at 36.864MHz as indicated in the PCM1780 datasheet for a 48kHz sample rate
- `Subsignal`: defines collections of signals for easier pin assignment within modules
- `LiteScopeAnalyzer`: a logic analyser placed alongside the SoC, sampling any selected signals within the design at the system clock frequency, with values stored in Block RAM and converted to a VCD waveform file which can be viewed in GTKWave
- `Builder`: converts the design object to a Verilog and invokes Yosys and nextpnr to synthesize and generate the FPGA bitstream
- `Builder`: converts the design object to a Verilog module and invokes Yosys and nextpnr to synthesize and generate the FPGA bitstream
- `Module`: creates a custom module that can be instanced and added as a submodule to other modules or the `BaseSoC`
- `ModuleDoc`: inheriting from this class results in the class docstring being used in the autogenerated documentation, allowing the documentation of a module to be placed alongside the module definition
- `CSRStorage`: register object that is read/write from the CPU and read-only from custom logic
@ -334,17 +333,17 @@ This section presents a high-level overview of the design of the system, and det
Figure x.y [below] is a block diagram representation of the StackSynth FPGA Extension board, including the SoC and the external Integrated Circuit components that are integral to the project function. Dotted lines represent analogue signals, which include the stereo audio signals from the PCM1780 DAC, through the DS1881E digital potentiometer and through the TS482 amplifier and 3.5mm headphone port. Thinner solid lines are single-bit digital signals, including clock signals and serial bit connections, while thicker solid lines are multi-bit digital signals or buses, including UART and the CSR bus. Later in the project, the VexRiscV CPU was replaced with a PicoRV32 CPU for testing a basic software implementation of interrupts; however, the overall architecture of the system remained unchanged.
![System Architecture Overview](notes/system-overview.png)
The block diagram is also colour coded to represent the different areas of the system, with physical components confirmed at the start of the project in red, parts of the OrangeCrab in orange, LiteX provided modules on the FPGA in blue, and modules created in this project in green. The FTDI USB to UART adapter is shown in the diagram as it is used to download traces from the LiteScope Analyzer, however it was not provided as part of the project and is external to the StackSynth FPGA Extension board.
In Figure x.y, the `Wave Sample Generator Block` represents a conversion from settings controlled from the CPU via the CSR bus to the final output samples sent to the DAC. A key design decision within this block is the generation of sample values when required without the use of a large wave-table. The OrangeCrab has limited Block RAM and a large memory would be required to provide the resolution desired for phase to sine wave conversion, for example, using a 16-bit phase to index a table with 2^16 or 65536 entries would require 1049Kb of Block RAM, more than the 1008Kb available on the ECP5 model used. Instead, a larger phase accumulator can be used, allowing for more precise phase steps providing better accuracy as the error is smaller, and minor errors due to rounding are averaged out over multiple cycles, reducing the likelihood of audible glitches. This phase accumulator can then be truncated to 16 bits by ignoring the lower 8 bits and then used for sample generation.
![System Architecture Overview](notes/systemOverview.png)
> Examiners are just as interested in the process as the end result, include design decisions, the available options and reasons for particular choices (critical assessment). Explain trade-offs including those out of your control.
In Figure x.y, the `Wave Sample Generator Block` converts settings, controlled from the CPU via the CSR bus, into the final output samples sent to the DAC. A key design decision within this block is to generate sample values when required rather than reading them from a large wave-table. The OrangeCrab has limited Block RAM, and a large memory would be required to provide the resolution desired for phase to sine wave conversion: for example, using a 16-bit phase to index a table with 2^16 (65536) entries would require 1049Kb of Block RAM, more than the 1008Kb available on the ECP5 model used. Instead, a larger phase accumulator can be used, which allows finer phase steps and therefore a smaller frequency error, while minor rounding errors are averaged out over multiple cycles, reducing the likelihood of audible glitches. This phase accumulator can then be truncated to 16 bits by ignoring the lower 8 bits and used for sample generation. Figure x.z shows the submodules within the `Wave Sample Generator Block`, including the CORDIC and GenerateWave modules.
- TODO: More description on block diagram?
[Figure: Wave Sample Generator Block internals]
- TODO: Colour coded diagram to show what is happening solely within FPGA and not on board
![Wave Sample Generator Block internals](notes/sampleGenerator.png)
- TODO: Diagram for wave sample generator block?
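The wave-table memory estimate and the benefit of a wider phase accumulator can be checked with a short Python sketch. The 24-bit accumulator width is inferred from the statement that truncating to 16 bits drops the lower 8 bits, and the 16-bit table entry width is inferred from the 1049Kb figure. Stepping the accumulator once per 48kHz sample, and all function names, are assumptions made for illustration rather than details taken from the project source.

```python
SAMPLE_RATE_HZ = 48_000                # DAC sample rate stated in the write-up
INDEX_BITS = 16                        # phase bits used for sample generation
DROPPED_BITS = 8                       # lower bits ignored when truncating
ACC_BITS = INDEX_BITS + DROPPED_BITS   # 24-bit accumulator (inferred)
ENTRY_BITS = 16                        # assumed width of one wave-table entry


def table_size_bits(index_bits=INDEX_BITS, entry_bits=ENTRY_BITS):
    """Bits of memory needed for a full wave-table with 2**index_bits entries."""
    return (1 << index_bits) * entry_bits


def phase_step(freq_hz, acc_bits=ACC_BITS, rate_hz=SAMPLE_RATE_HZ):
    """Nearest integer phase increment per sample for the requested frequency."""
    return round(freq_hz * (1 << acc_bits) / rate_hz)


def actual_frequency(step, acc_bits=ACC_BITS, rate_hz=SAMPLE_RATE_HZ):
    """Frequency actually produced by a given integer phase increment."""
    return step * rate_hz / (1 << acc_bits)


def truncate_phase(acc):
    """Keep the upper 16 bits of the 24-bit accumulator value."""
    return (acc & ((1 << ACC_BITS) - 1)) >> DROPPED_BITS
```

Under these assumptions, a 440Hz tone has a frequency error of roughly 0.0004Hz with the 24-bit accumulator, compared with roughly 0.19Hz if the phase step were rounded to 16 bits.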
The `Async FIFO` block handles the transfer of generated samples from the 48MHz system clock domain to the 36.864MHz DAC clock domain: its write port is driven by the system clock and its read port is driven by the DAC clock. This block is required because the DAC clock is neither an integer multiple nor an integer division of the system clock, so simply chaining multiple buffer registers may not prevent metastability. The samples are then fed into the DAC Driver, which uses an internal counter to generate the bit clock at 2.304MHz and the left-right clock at 48kHz.
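The clock relationships described above can be verified with a few lines of Python. The frequencies come from the text; the counter layout in `divider_outputs` (bit clock taken from counter bit 3, left-right clock high for the second half of each 768-cycle frame) is a hypothetical arrangement for illustration and is not taken from the project's DAC Driver.

```python
from fractions import Fraction

SYS_CLK_HZ = 48_000_000   # system clock domain
DAC_CLK_HZ = 36_864_000   # DAC system clock
BIT_CLK_HZ = 2_304_000    # DAC bit clock
LR_CLK_HZ = 48_000        # left-right (sample rate) clock

# Exact ratios between the clocks, reduced to lowest terms.
sys_to_dac = Fraction(SYS_CLK_HZ, DAC_CLK_HZ)   # non-integer: domains are unrelated
dac_to_bit = Fraction(DAC_CLK_HZ, BIT_CLK_HZ)   # integer: a counter can divide it
bit_to_lr = Fraction(BIT_CLK_HZ, LR_CLK_HZ)     # integer: bit clocks per stereo frame


def divider_outputs(count):
    """Hypothetical counter-based divider, clocked by the DAC clock.

    Returns (bit_clock, lr_clock) levels for a free-running counter value.
    """
    frame = int(dac_to_bit * bit_to_lr)          # DAC clock cycles per LR period
    bit_clock = (count >> 3) & 1                 # toggles every 8 cycles: divide by 16
    lr_clock = 1 if (count % frame) >= frame // 2 else 0
    return bit_clock, lr_clock
```

The system-to-DAC ratio reduces to 125/96, which is why a clock-domain crossing is needed between sample generation and the DAC Driver, whereas the bit clock and left-right clock are exact divisions (by 16 and by 768) of the DAC clock.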
The final major design choice in this project is to use SystemVerilog (IEEE 1800-2017), including constructs such as `always_comb` and `always_ff` blocks in place of Verilog `always` blocks, and `logic` in place of `wire` or `reg`. This choice was made for several reasons: the extra compile-time checks; improved readability, as a block is immediately identifiable as combinatorial or synchronous logic; and the ability to use newer open-source tools to check code quality and semantic correctness when writing blocks for logic not already provided by LiteX. However, the open-source version of Yosys used with Project Trellis supports only a limited set of SystemVerilog constructs, so the code must still be written in a form that Yosys can synthesise.