The RVCore Project is a research and development project for RISC-V soft processors highly optimized for FPGAs.
RVCoreP (RISC-V Core Pipelined version) is one of the RISC-V soft processor cores of the RVCore Project. It is an optimized RISC-V soft processor with a five-stage pipeline.
RVCoreP supports the following FPGA boards!
The main specifications of RVCoreP are shown below:
The latest version of RVCoreP is Ver.0.5.3: rvcorep_ver053.zip
The other old versions are as follows:
This source code is released under the MIT License, see LICENSE.txt.
Install Verilator with the following command.
$ sudo apt install verilator
Install Icarus Verilog (iverilog) with the following command.
$ sudo apt install iverilog
Install pySerial with the following commands.
$ sudo apt install python3-pip
$ pip3 install pyserial
(1) Download the source code of the RVCoreP
$ wget "http://www.arch.cs.titech.ac.jp/wk/rvcore/lib/exe/fetch.php?media=rvcorep_ver053.zip" -O rvcorep_ver053.zip
(2) Extract the downloaded zip file
$ unzip rvcorep_ver053.zip
$ cd rvcorep_ver053
Execute the following commands in the recommended environment.
(1) Compile source code written in Verilog HDL using Verilator
$ make
verilator -DSERIAL_WCNT=2 -DNO_IP --public --top-module top --clk CLK --x-assign 0 --x-initial 0 --no-threads -O2 -Wno-WIDTH -Wno-UNSIGNED --exe sim.cpp --cc top.v main.v uart.v debug.v proc.v
make -j -C obj_dir -f Vtop.mk Vtop
cp obj_dir/Vtop simv
The executable file simv is generated by the compilation.
(2) Execute the Verilog HDL simulation
By default, the test benchmark is executed.
The memory file of the test benchmark is test/test.mem.
$ make run
./simv
Run test/test.mem
Initializing : ..........
--------------------------------------------------
---- nqueen ----
Nqueen : N = 6
The number of solutions = 4
----------------
---- qsort ----
Sorted Seqence :
59321 A4C86 AC7D3 B210A 142044 1DEC15 1EC216 2536B2 278BCF 34A2AC
----------------
---- fib ----
Fibonacci Seqence :
1: 1
2: 1
3: 2
4: 3
5: 5
6: 8
7: D
8: 15
9: 22
A: 37
----------------
---- acker ----
acker(0,0) = 1
acker(0,1) = 2
acker(0,2) = 3
acker(1,0) = 2
acker(1,1) = 3
acker(1,2) = 4
acker(2,0) = 3
acker(2,1) = 5
acker(2,2) = 7
----------------
== elapsed clock cycles : 35030
== valid instructions executed : 28934
== IPC : 0.825
== branch prediction hit : 3615
== branch prediction miss : 1430
== branch prediction total : 5045
== branch prediction hit rate : 0.716
== the num of load-use stall : 1897
== estimated clock cycles : 35121
== r_cnt : 000088d6
== r_rout : 000000a0
- top.v:154: Verilog $finish
The simulation prints the output shown above. After the run finishes, statistics such as IPC (instructions per cycle) and the branch prediction hit rate are written to the console.
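The derived figures in this log can be recomputed from the raw counters. The following is a minimal sketch, not part of the RVCoreP distribution: the helper names `parse_stats` and `truncate3` are hypothetical, and truncating to three decimals is an assumption inferred from the sample output (28934 / 35030 is about 0.8260, yet the log prints 0.825).

```python
import re

# Counter lines copied from the sample simulation log.
SAMPLE = """\
== elapsed clock cycles : 35030
== valid instructions executed : 28934
== branch prediction hit : 3615
== branch prediction miss : 1430
== branch prediction total : 5045
"""

def parse_stats(text):
    """Collect '== name : <decimal integer>' lines into a dict (hypothetical helper)."""
    stats = {}
    for line in text.splitlines():
        m = re.match(r"==\s*(.+?)\s*:\s*(\d+)\s*$", line)
        if m:
            stats[m.group(1)] = int(m.group(2))
    return stats

def truncate3(x):
    """Truncate (not round) to three decimals; an assumption based on the sample log."""
    return int(x * 1000) / 1000

stats = parse_stats(SAMPLE)
ipc = truncate3(stats["valid instructions executed"] / stats["elapsed clock cycles"])
hit_rate = truncate3(stats["branch prediction hit"] / stats["branch prediction total"])
print("IPC      :", ipc)
print("hit rate :", hit_rate)
```

With the counters above, this reproduces the IPC and hit-rate values printed by the simulator, and hit + miss equals the reported total.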
(3) Execute the Dhrystone and Coremark benchmarks by the Verilog HDL simulation
The Dhrystone and Coremark benchmarks can be compiled and executed in the simulation.
The memory files, the binary files, and the dump files are stored in the bench/ directory.
For Dhrystone, three configurations are provided that differ in the parameter NUMBER_OF_RUNS.
For Coremark, three configurations are also provided that differ in the parameter ITERATIONS.
These benchmarks can be run with the following commands.
$ make [configuration name]
$ make run
(4) Execute the Embench benchmark by the Verilog HDL simulation
The 19 benchmarks of Embench can be compiled and executed in the same way.
The memory files, the binary files, and the dump files are stored in the embench/ directory.
These benchmarks can be run with the following commands.
$ make [benchmark name]
$ make run
The execution results of dhrystone3, coremark3, and the 19 benchmarks of Embench are summarized in the file result.txt.
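The make-then-run pattern used in steps (3) and (4) can be scripted when several benchmarks are run in a row. The sketch below is an illustration only: the helper names are hypothetical, and it assumes that dhrystone3 and coremark3 (the names mentioned above) are valid make targets. It defaults to a dry run that only prints the commands.

```python
import subprocess

def benchmark_commands(name):
    """Return the two make invocations for one benchmark configuration."""
    return [["make", name], ["make", "run"]]

def run_benchmark(name, dry_run=True):
    """Print each command and, unless dry_run is set, execute it in the current directory."""
    for cmd in benchmark_commands(name):
        print("$ " + " ".join(cmd))
        if not dry_run:
            # check=True stops at the first failing make invocation
            subprocess.run(cmd, check=True)

if __name__ == "__main__":
    # Target names are assumptions taken from the result summary above.
    for name in ["dhrystone3", "coremark3"]:
        run_benchmark(name)
```

Pass `dry_run=False` from inside the extracted source directory to actually run the builds and simulations.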
Execute the following steps in the directory of the downloaded source code, in the recommended environment.
(1) Open the project file main.xpr in Xilinx Vivado
$ vivado main.xpr &
(2) Perform logic synthesis, placement and routing, and bitstream generation using Xilinx Vivado
Select the set of logic synthesis and place-and-route runs that corresponds to the FPGA board you want to use.
Execute the following process according to the selected set of logic synthesis and place-and-route runs.
By default, the operating frequency of the processor is set to 160 MHz.
In synth_2, the logic synthesis run for the Arty A7 board, the following option is added to the synthesis options.
-verilog_define ARTYA7=1
The source code is switched to the Arty A7 board by the definition of this macro, ARTYA7, so please be careful when changing the logic synthesis strategy yourself.
(3) Write the generated bitstream to the FPGA board
When the bitstream data is correctly written to the Nexys 4 DDR board, the DONE LED lights up and "00000000" is displayed on the 8-digit 7-segment LEDs.
(4) Prepare for 8 Mbaud serial communication
Please move to the test/ directory for the following steps.
$ cd test/
For serial communication with RVCoreP, the pySerial program serial_rvcorep.py is used.
This program sends a binary file to the FPGA board and outputs the received characters via pySerial.
Before executing this program, check the setting of the port used for serial communication in the program.
By default, /dev/ttyUSB1 is set.
The usage of this program is as follows:
$ python3 serial_rvcorep.py [serial baud rate (Mbaud)] [binary file name]
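To make the command-line convention concrete, here is a minimal sketch of a sender in the same spirit. It is not the actual serial_rvcorep.py: the function names are hypothetical, the defaults (8 Mbaud, test.bin) are inferred from the sample session in step (5), and sending the file as raw bytes with no framing is an assumption. Only `serial.Serial`, `write`, and `read` from pySerial are used.

```python
import sys

DEFAULT_MBAUD = 8           # assumption: inferred from the sample session
DEFAULT_FILE = "test.bin"   # assumption: inferred from the sample session
PORT = "/dev/ttyUSB1"       # default port named in this document

def parse_args(argv):
    """Map [Mbaud] [file] to (baud rate in baud, file name), applying defaults."""
    mbaud = int(argv[0]) if len(argv) > 0 else DEFAULT_MBAUD
    filename = argv[1] if len(argv) > 1 else DEFAULT_FILE
    return mbaud * 1_000_000, filename

def send_and_echo(port, payload, out=sys.stdout):
    """Write payload to an open port, then print received characters until Ctrl-C."""
    port.write(payload)
    try:
        while True:
            data = port.read(1)  # returns b"" when the read times out
            if data:
                out.write(data.decode("ascii", errors="replace"))
                out.flush()
    except KeyboardInterrupt:
        pass

if __name__ == "__main__":
    import serial  # pySerial; imported here so parse_args works without it

    baud, filename = parse_args(sys.argv[1:])
    print("serial baud rate :", baud)
    print("send file        :", filename)
    with open(filename, "rb") as f, serial.Serial(PORT, baud, timeout=1) as port:
        send_and_echo(port, f.read())
```

Keeping the argument parsing separate from the port handling lets the conversion from Mbaud to baud be checked without any hardware attached.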
(5) Send the RISC-V program binary to the FPGA board by serial communication and execute the program
To send the test program test.bin and start serial communication, execute one of the following commands.
$ python3 serial_rvcorep.py 8 "test.bin"
or
$ python3 serial_rvcorep.py
The following execution result is output via serial communication.
$ python3 serial_rvcorep.py
serial baud rate : 8000000
send file : test.bin
---- nqueen ----
Nqueen : N = 6
The number of solutions = 4
----------------
---- qsort ----
Sorted Seqence :
59321 A4C86 AC7D3 B210A 142044 1DEC15 1EC216 2536B2 278BCF 34A2AC
----------------
---- fib ----
Fibonacci Seqence :
1: 1
2: 1
3: 2
4: 3
5: 5
6: 8
7: D
8: 15
9: 22
A: 37
----------------
---- acker ----
acker(0,0) = 1
acker(0,1) = 2
acker(0,2) = 3
acker(1,0) = 2
acker(1,1) = 3
acker(1,2) = 4
acker(2,0) = 3
acker(2,1) = 5
acker(2,2) = 7
----------------
When using the Nexys 4 DDR board, the 7-segment LEDs show the value of the program counter of the processor at the end of execution.
If you execute the test benchmark test/test.bin, the output to the 7-segment LEDs is 000000A0.
When using the Nexys 4 DDR board, pressing the button "BTNU" makes the 7-segment LEDs show the number of execution cycles.
If you execute the test benchmark test/test.bin, the output to the 7-segment LEDs is 000088D6.
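The 7-segment values are hexadecimal, so they can be cross-checked against the simulation log: 000088D6 corresponds to the 35030 elapsed clock cycles (r_cnt) reported by the simulator, and 000000A0 to r_rout. The short sketch below does this conversion; the wall-clock estimate assumes that the cycle counter runs at the processor clock and that the default 160 MHz setting is used, and the helper name is hypothetical.

```python
CLOCK_HZ = 160_000_000  # default operating frequency; assumed to clock the cycle counter

def seven_segment_to_int(digits):
    """Interpret the 8-digit 7-segment display as a hexadecimal number."""
    return int(digits, 16)

cycles = seven_segment_to_int("000088D6")  # value shown while BTNU is pressed
pc = seven_segment_to_int("000000A0")      # program counter at the end of execution
elapsed_us = cycles / CLOCK_HZ * 1e6       # execution time in microseconds

print(cycles)                  # 35030
print(hex(pc))                 # 0xa0
print(f"{elapsed_us:.1f} us")  # 218.9 us
```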
Press Ctrl-C to terminate this program.
If you want to send the binary file of the program to the FPGA board again and execute it, proceed from step (3).
Please refer to the following page: How to build the RISC-V cross compiler and RISC-V binary files that work with RVCoreP
Kise Laboratory, Department of Computer Science, School of Computing, Tokyo Institute of Technology (Tokyo Tech)
Maintainer : (E-mail) riscv-support (at) arch.cs.titech.ac.jp
Contributor : Hiromu Miyazaki, Takuto Kanamori, Md Ashraful Islam, Kenji Kise
Copyright © 2020 Kise Laboratory, Tokyo Institute of Technology