m1ezv
— | m1ezv [2024/12/09 01:06] (current) – [Implementation and execution on a FPGA board] 94.103.125.62 | ||
---|---|---|---|
Line 1: | Line 1: | ||
+ | # RVCore Project, Arch Lab, Tokyo Tech | ||
+ | |||
+ | The RVCore Project is a research and development project | ||
+ | for RISC-V soft processors highly optimized for FPGAs. | ||
+ | |||
+ | **RVCoreP** (**R**ISC-**V** **Core** **P**ipelined version) is one of the RISC-V soft processor cores of the RVCore Project. | ||
+ | It is an optimized RISC-V soft processor with a five-stage pipeline. | ||
+ | |||
+ | {{: | ||
+ | |||
+ | **RVCoreP supports the following FPGA boards!** | ||
+ | |||
+ | - [[https:// | ||
+ | - [[https:// | ||
+ | |||
+ | |||
+ | |||
+ | ## What's new | ||
+ | |||
+ | - 2020/06/25 : Released Ver.0.5.3, which adds support for the Arty A7-35T FPGA board | ||
+ | - [[old2|2020/ | ||
+ | - 2020/05/17 : Changed the web page structure and released Ver.0.5.1 | ||
+ | - [[old1|2020/ | ||
+ | - 2020/03/04 : Released this page for Ver.0.4.6 | ||
+ | |||
+ | |||
+ | |||
+ | ## About RVCoreP | ||
+ | |||
+ | The main specifications of RVCoreP are shown below: | ||
+ | |||
+ | - An optimized RISC-V soft processor | ||
+ | - Adopt **RV32I** of RISC-V as an instruction set architecture, | ||
+ | - Adopt **five-stage pipelining** | ||
+ | - Instruction fetch (If) | ||
+ | - Instruction decode (Id) | ||
+ | - Instruction execution (Ex) | ||
+ | - Memory access (Ma) | ||
+ | - Write back (Wb) | ||
+ | - Apply **three effective optimization methods** to improve the operating frequency | ||
+ | - Instruction fetch unit optimization including the pipelined branch prediction mechanism | ||
+ | - ALU optimization | ||
+ | - Data alignment and sign-extension optimization for data memory output | ||
+ | - Implemented in **Verilog HDL** | ||
+ | - Run RISC-V programs compiled for RV32I | ||
+ | - In Verilog HDL simulation using Verilator or Icarus Verilog | ||
+ | - On FPGA boards that include a **Xilinx Artix-7 FPGA** | ||
+ | |||
+ | ## Download source file | ||
+ | |||
+ | The latest version of RVCoreP is Ver.0.5.3: {{ : | ||
+ | |||
+ | Older versions are as follows: | ||
+ | |||
+ | - Ver.0.5.1: {{ : | ||
+ | - Ver.0.4.6: {{ : | ||
+ | |||
+ | This source code is released under the MIT License; see LICENSE.txt. | ||
+ | |||
+ | ### Change log | ||
+ | |||
+ | - Ver.0.5.3 : The version that supports the Arty A7-35T FPGA board | ||
+ | - Ver.0.5.1 : The version supporting Verilator, Embench, pySerial, Vivado 2019.2 | ||
+ | - Ver.0.4.6 : The version used in our submitted manuscript | ||
+ | |||
+ | |||
+ | |||
+ | |||
+ | ## Recommended environment | ||
+ | |||
+ | - Ubuntu 18.04 LTS | ||
+ | - [[https:// | ||
+ | - [[http:// | ||
+ | - [[https:// | ||
+ | - [[https:// | ||
+ | - Python 3.6.9 | ||
+ | - [[https:// | ||
+ | |||
+ | ### Install command | ||
+ | |||
+ | Install Verilator with the following command. | ||
+ | |||
+ | ``` | ||
+ | $ sudo apt install verilator | ||
+ | ``` | ||
+ | |||
+ | Install Icarus Verilog (`iverilog`) with the following command. | ||
+ | |||
+ | ``` | ||
+ | $ sudo apt install iverilog | ||
+ | ``` | ||
+ | |||
+ | Install pySerial with the following commands. | ||
+ | |||
+ | ``` | ||
+ | $ sudo apt install python3-pip | ||
+ | $ pip3 install pyserial | ||
+ | ``` | ||
+ | |||
+ | |||
+ | |||
+ | |||
+ | ## Getting started guide | ||
+ | |||
+ | (1) Download the source code of RVCoreP | ||
+ | |||
+ | ``` | ||
+ | $ wget http:// | ||
+ | ``` | ||
+ | |||
+ | (2) Extract the downloaded zip file | ||
+ | |||
+ | ``` | ||
+ | $ unzip rvcorep_ver053.zip | ||
+ | $ cd rvcorep_ver053 | ||
+ | ``` | ||
+ | |||
+ | ### Verilog HDL simulation using Verilator | ||
+ | |||
+ | Execute the following commands in the recommended environment. | ||
+ | |||
+ | (1) Compile source code written in Verilog HDL using Verilator | ||
+ | |||
+ | ``` | ||
+ | $ make | ||
+ | verilator -DSERIAL_WCNT=2 -DNO_IP --public --top-module top --clk CLK --x-assign 0 --x-initial 0 --no-threads -O2 -Wno-WIDTH -Wno-UNSIGNED --exe sim.cpp --cc top.v main.v uart.v debug.v proc.v | ||
+ | make -j -C obj_dir -f Vtop.mk Vtop | ||
+ | cp obj_dir/ | ||
+ | ``` | ||
+ | |||
+ | Compilation generates the executable file `simv`. | ||
+ | |||
+ | (2) Execute the Verilog HDL simulation | ||
+ | |||
+ | By default, the test benchmark is executed. | ||
+ | The memory file of the test benchmark is `test/ | ||
+ | |||
+ | ### Execution result when running `test/ | ||
+ | |||
+ | ``` | ||
+ | $ make run | ||
+ | ./simv | ||
+ | Run test/ | ||
+ | Initializing : .......... | ||
+ | -------------------------------------------------- | ||
+ | ---- nqueen ---- | ||
+ | Nqueen : | ||
+ | N = 6 | ||
+ | The number of solutions = 4 | ||
+ | ---------------- | ||
+ | |||
+ | ---- qsort ---- | ||
+ | Sorted Seqence : | ||
+ | 59321 | ||
+ | A4C86 | ||
+ | AC7D3 | ||
+ | B210A | ||
+ | 142044 | ||
+ | 1DEC15 | ||
+ | 1EC216 | ||
+ | 2536B2 | ||
+ | 278BCF | ||
+ | 34A2AC | ||
+ | ---------------- | ||
+ | |||
+ | ---- fib ---- | ||
+ | Fibonacci Seqence : | ||
+ | 1: 1 | ||
+ | 2: 1 | ||
+ | 3: 2 | ||
+ | 4: 3 | ||
+ | 5: 5 | ||
+ | 6: 8 | ||
+ | 7: D | ||
+ | 8: 15 | ||
+ | 9: 22 | ||
+ | A: 37 | ||
+ | ---------------- | ||
+ | |||
+ | ---- acker ---- | ||
+ | acker(0,0) = 1 | ||
+ | acker(0,1) = 2 | ||
+ | acker(0,2) = 3 | ||
+ | acker(1,0) = 2 | ||
+ | acker(1,1) = 3 | ||
+ | acker(1,2) = 4 | ||
+ | acker(2,0) = 3 | ||
+ | acker(2,1) = 5 | ||
+ | acker(2,2) = 7 | ||
+ | ---------------- | ||
+ | |||
+ | == elapsed clock cycles | ||
+ | == valid instructions executed | ||
+ | == IPC : | ||
+ | == branch prediction hit : | ||
+ | == branch prediction miss : 1430 | ||
+ | == branch prediction total : | ||
+ | == branch prediction hit rate : 0.716 | ||
+ | == the num of load-use stall : | ||
+ | == estimated clock cycles | ||
+ | == r_cnt : | ||
+ | == r_rout | ||
+ | - top.v:154: Verilog $finish | ||
+ | ``` | ||
+ | |||
+ | You should see the output above. | ||
+ | After the simulation finishes, statistics such as IPC (instructions per cycle) and the branch prediction hit rate are printed to the console. | ||
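To make the two headline statistics concrete, the sketch below shows one way they could be derived from raw counters. The function names and the sample counter values are hypothetical and are not taken from the RVCoreP sources. It also assumes that the branch prediction total is the sum of hits and misses, which the log suggests but does not state.

```python
def ipc(valid_instructions: int, elapsed_cycles: int) -> float:
    """Instructions per cycle: executed instructions / elapsed clock cycles."""
    if elapsed_cycles <= 0:
        raise ValueError("elapsed_cycles must be positive")
    return valid_instructions / elapsed_cycles


def branch_hit_rate(hits: int, misses: int) -> float:
    """Fraction of predicted branches that were predicted correctly."""
    total = hits + misses  # assumption: total = hit + miss
    if total == 0:
        return 0.0  # no branches were predicted, so report 0 rather than divide by zero
    return hits / total


# Hypothetical counter values, for illustration only (not measured results).
print(f"IPC      : {ipc(750, 1000):.3f}")
print(f"hit rate : {branch_hit_rate(3, 1):.3f}")
```

Replacing the sample values with the counters printed at the end of a run should give figures comparable to the ones the simulator reports, if the assumptions above hold.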
+ | |||
+ | |||
+ | |||
+ | (3) Execute the Dhrystone and CoreMark benchmarks in the Verilog HDL simulation | ||
+ | |||
+ | You can compile and run the Dhrystone and CoreMark benchmarks. | ||
+ | |||
+ | The memory files, the binary files, and the dump files are stored in the `bench/` directory. | ||
+ | |||
+ | For Dhrystone, three configurations are provided, which differ in the parameter `NUMBER_OF_RUNS`. | ||
+ | |||
+ | - benchd | ||
+ | - benchd2 : NUMBER\_OF\_RUNS=2000 | ||
+ | - benchd3 : NUMBER\_OF\_RUNS=10000 | ||
+ | |||
+ | For CoreMark, three configurations are likewise provided, which differ in the parameter `ITERATIONS`. | ||
+ | |||
+ | - benchc | ||
+ | - benchc2 : ITERATIONS=2 | ||
+ | - benchc3 : ITERATIONS=10 | ||
+ | |||
+ | These benchmarks can be run with the following commands. | ||
+ | |||
+ | ``` | ||
+ | $ make [configuration name] | ||
+ | $ make run | ||
+ | ``` | ||
+ | |||
+ | (4) Execute the [[https:// | ||
+ | |||
+ | You can compile and run the 19 benchmarks in Embench. | ||
+ | |||
+ | The memory files, the binary files, and the dump files are stored in the `embench/` directory. | ||
+ | |||
+ | These benchmarks can be run with the following commands. | ||
+ | |||
+ | ``` | ||
+ | $ make [benchmark name] | ||
+ | $ make run | ||
+ | ``` | ||
+ | |||
+ | The execution results of dhrystone3, coremark3, and the 19 Embench benchmarks are summarized in the file `result.txt`. | ||
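Running several configurations one after another can be scripted. The following is a minimal sketch, assuming only the `make [name]` / `make run` pattern shown above. The helper names `build_commands` and `run_all` are hypothetical and are not part of the RVCoreP distribution.

```python
import subprocess
from typing import List, Sequence


def build_commands(configs: Sequence[str]) -> List[List[str]]:
    """Return the `make <name>` / `make run` pair for each configuration."""
    commands: List[List[str]] = []
    for name in configs:
        commands.append(["make", name])   # prepare the chosen configuration
        commands.append(["make", "run"])  # run the simulation
    return commands


def run_all(configs: Sequence[str], cwd: str = ".") -> None:
    """Run every command in order, stopping at the first failure."""
    for cmd in build_commands(configs):
        subprocess.run(cmd, cwd=cwd, check=True)
```

For example, `run_all(["benchd2", "benchc2"], cwd="rvcorep_ver053")` would issue the four `make` invocations in sequence from the extracted source directory.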
+ | |||
+ | |||
+ | ### Implementation and execution on an FPGA board | ||
+ | |||
+ | Perform the following steps in the directory of the downloaded source code, in the recommended environment. | ||
+ | |||
+ | (1) Open the project file `main.xpr` in Xilinx Vivado | ||
+ | |||
+ | ``` | ||
+ | $ vivado main.xpr & | ||
+ | ``` | ||
+ | |||
+ | (2) Perform logic synthesis, placement and routing, and bitstream generation using Xilinx Vivado | ||
+ | |||
+ | Use the logic synthesis and place-and-route runs that match the FPGA board you want to use. | ||
+ | |||
+ | - When using the Nexys 4 DDR board : synth\_1 and impl\_1 | ||
+ | - When using the Arty A7-35T board : synth\_2 and impl\_2 | ||
+ | |||
+ | Then perform the following steps with the selected runs. | ||
+ | |||
+ | - Right click on synth_* for logic synthesis and select & | ||
+ | - Click & | ||
+ | |||
+ | By default, the operating frequency of the processor is set to 160 MHz. | ||
+ | |||
+ | For synth\_2, the run for the Arty A7 board, | ||
+ | the following option is added to the settings of the logic synthesis strategy. | ||
+ | |||
+ | - `-verilog_define ARTYA7=1` | ||
+ | |||
+ | Defining the macro `ARTYA7` switches the source code to the Arty A7 configuration, | ||
+ | so take care to keep this option if you change the logic synthesis strategy yourself. | ||
+ | |||
+ | (3) Write the generated bitstream to the FPGA board | ||
+ | |||
+ | - Click & | ||
+ | |||
+ | - Click & | ||
+ | |||
+ | - Click & | ||
+ | |||
+ | - Click & | ||
+ | |||
+ | When the bitstream data is correctly written to the Nexys 4 DDR board, | ||
+ | the DONE LED lights up and & | ||
+ | |||
+ | (4) Prepare for 8 Mbaud serial communication | ||
+ | |||
+ | Move to the `test/` directory for the following steps. | ||
+ | |||
+ | ``` | ||
+ | $ cd test/ | ||
+ | ``` | ||
+ | |||
+ | For serial communication with RVCoreP, the pySerial program `serial_rvcorep.py` is used. | ||
+ | |||
+ | This program sends a binary file to the FPGA board and outputs the received characters via pySerial. | ||
+ | |||
+ | Before executing this program, check the setting of the port used for serial communication in the program. | ||
+ | |||
+ | By default, `/ | ||
+ | |||
+ | The usage of this program is as follows: | ||
+ | |||
+ | ``` | ||
+ | $ python3 serial_rvcorep.py [serial baud rate (Mbaud)] [binary file name] | ||
+ | ``` | ||
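For readers unfamiliar with pySerial, the sketch below shows the general shape of a send-then-listen program. It is not the shipped `serial_rvcorep.py`. The function names are hypothetical, the port must be supplied by the caller, and any framing or padding the real script applies to the binary is unknown and is not reproduced here.

```python
import sys


def mbaud_to_baud(mbaud: float) -> int:
    """Convert a rate given in Mbaud (as on the command line) to baud."""
    return int(round(mbaud * 1_000_000))


def send_and_echo(port: str, mbaud: float, path: str) -> None:
    """Send a binary file, then print whatever the board sends back."""
    import serial  # pySerial; imported here so the helper above has no dependency

    with open(path, "rb") as f:
        payload = f.read()

    # Assumption: the raw file contents are written unchanged to the port.
    with serial.Serial(port, mbaud_to_baud(mbaud), timeout=1) as ser:
        ser.write(payload)
        try:
            while True:
                data = ser.read(256)  # returns b"" when the timeout expires
                if data:
                    sys.stdout.write(data.decode("ascii", errors="replace"))
                    sys.stdout.flush()
        except KeyboardInterrupt:
            pass  # Ctrl-C ends the session
```

With this helper, a rate argument of 8 corresponds to 8000000 baud, which matches the `serial baud rate : 8000000` line printed by the real program.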
+ | |||
+ | (5) Send the RISC-V program binary to the FPGA board by serial communication and execute the program | ||
+ | |||
+ | To send the test program `test.bin` and start serial communication, run one of the following commands: | ||
+ | |||
+ | ``` | ||
+ | $ python3 serial_rvcorep.py 8 & | ||
+ | ``` | ||
+ | |||
+ | or | ||
+ | |||
+ | ``` | ||
+ | $ python3 serial_rvcorep.py | ||
+ | ``` | ||
+ | |||
+ | - Send the test benchmark to the FPGA board via serial communication and execute the program on the implemented processor | ||
+ | |||
+ | The following execution result is output via serial communication. | ||
+ | |||
+ | ### Execution result when running `test.bin` | ||
+ | |||
+ | ``` | ||
+ | $ python3 serial_rvcorep.py | ||
+ | serial baud rate : 8000000 | ||
+ | send file : test.bin | ||
+ | ---- nqueen ---- | ||
+ | Nqueen : | ||
+ | N = 6 | ||
+ | The number of solutions = 4 | ||
+ | ---------------- | ||
+ | |||
+ | ---- qsort ---- | ||
+ | Sorted Seqence : | ||
+ | 59321 | ||
+ | A4C86 | ||
+ | AC7D3 | ||
+ | B210A | ||
+ | 142044 | ||
+ | 1DEC15 | ||
+ | 1EC216 | ||
+ | 2536B2 | ||
+ | 278BCF | ||
+ | 34A2AC | ||
+ | ---------------- | ||
+ | |||
+ | ---- fib ---- | ||
+ | Fibonacci Seqence : | ||
+ | 1: 1 | ||
+ | 2: 1 | ||
+ | 3: 2 | ||
+ | 4: 3 | ||
+ | 5: 5 | ||
+ | 6: 8 | ||
+ | 7: D | ||
+ | 8: 15 | ||
+ | 9: 22 | ||
+ | A: 37 | ||
+ | ---------------- | ||
+ | |||
+ | ---- acker ---- | ||
+ | acker(0,0) = 1 | ||
+ | acker(0,1) = 2 | ||
+ | acker(0,2) = 3 | ||
+ | acker(1,0) = 2 | ||
+ | acker(1,1) = 3 | ||
+ | acker(1,2) = 4 | ||
+ | acker(2,0) = 3 | ||
+ | acker(2,1) = 5 | ||
+ | acker(2,2) = 7 | ||
+ | ---------------- | ||
+ | ``` | ||
+ | |||
+ | When using the Nexys 4 DDR board, the 7-segment LED shows the value of the program counter of the processor at the end of execution. | ||
+ | If you execute the test benchmark `test/ | ||
+ | |||
+ | When using Nexys 4 DDR board and the button " | ||
+ | If you execute the test benchmark `test/ | ||
+ | |||
+ | |||
+ | Terminate this program with Ctrl-C. | ||
+ | |||
+ | If you want to send the binary file of the program to the FPGA board again and execute it, proceed from step (3). | ||
+ | |||
+ | |||
+ | |||
+ | |||
+ | ## How to build the RISC-V cross compiler and RISC-V binary files that work with RVCoreP | ||
+ | |||
+ | Please refer to the following page: [[binary|How to build the RISC-V cross compiler and RISC-V binary files that work with RVCoreP]] | ||
+ | |||
+ | |||
+ | |||
+ | ## Publication | ||
+ | |||
+ | - [[https:// | ||
+ | |||
+ | - Hiromu Miyazaki, Takuto Kanamori, Md Ashraful Islam, Kenji Kise: RVCoreP : An optimized RISC-V soft processor of five-stage pipelining, [[https:// | ||
+ | |||
+ | |||
+ | |||
+ | ## Contact | ||
+ | |||
+ | [[http:// | ||
+ | |||
+ | Maintainer : (E-mail) riscv-support (at) arch.cs.titech.ac.jp | ||
+ | |||
+ | Contributors : Hiromu Miyazaki, Takuto Kanamori, Md Ashraful Islam, Kenji Kise | ||
+ | |||
+ | |||
+ | |||
+ | ## Other Project | ||
+ | |||
+ | - [[http:// | ||
+ | |||
+ | Copyright (c) 2020 Kise Laboratory, Tokyo Institute of Technology | ||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
+ | |||
m1ezv.txt · Last modified: 2024/12/09 01:06 by 94.103.125.62