This old page was copied on May 17, 2020.
RVCore Project, Arch Lab, Tokyo Tech
The RVCore Project is a research and development project on RISC-V soft processors highly optimized for FPGAs.
RVCoreP (RISC-V Core Pipelined version) is one of the RISC-V soft processor cores of the RVCore Project. It is an optimized RISC-V soft processor with a five-stage pipeline.
About RVCoreP
The main specifications of RVCoreP are shown below:
- An optimized RISC-V soft processor
- Adopts RV32I, the base 32-bit integer instruction set of RISC-V, as its instruction set architecture
- Adopts five-stage pipelining
  - Instruction fetch (If)
  - Instruction decode (Id)
  - Instruction execution (Ex)
  - Memory access (Ma)
  - Write back (Wb)
- Applies three effective optimization methods to improve the operating frequency
  - Instruction fetch unit optimization, including a pipelined branch prediction mechanism
  - ALU optimization
  - Data alignment and sign-extension optimization for data memory output
- Implemented in Verilog HDL
- Runs RISC-V programs compiled for RV32I
  - In Verilog HDL simulation using Icarus Verilog or Verilator
  - On FPGA boards, including boards with a Xilinx Artix-7 FPGA
Download source file
The source code of RVCoreP Ver.0.4.6: rvcorep_ver046.zip
This source code is released under the MIT License; see LICENSE.txt.
Recommended environment
- Ubuntu 18.04 LTS for executing Xilinx Vivado
- Windows 10 for serial communication
- Icarus Verilog for Verilog HDL simulation
- Verilator for Verilog HDL simulation, optional
- Xilinx Vivado Design Suite 2017.2 for logic synthesis
- Nexys 4 DDR board with Xilinx Artix-7 FPGA for placement and routing
- Tera Term for serial communication with FPGA board
Install iverilog with the following command.
$ sudo apt install iverilog
Optionally, install verilator with the following command.
$ sudo apt install verilator
Getting started guide
(1) Download the source code of the RVCoreP
$ wget "http://www.arch.cs.titech.ac.jp/wk/rvcore/lib/exe/fetch.php?media=rvcorep_ver046.zip" -O rvcorep_ver046.zip
(2) Extract the downloaded zip file
$ unzip rvcorep_ver046.zip
$ cd rvcorep_ver046
Verilog HDL simulation using Icarus Verilog
Execute the following commands in the recommended environment.
(1) Compile source code written in Verilog HDL using Icarus Verilog
$ make
iverilog -DSERIAL_WCNT=2 -DNO_IP -o simv top.v main.v uart.v debug.v proc.v
The executable file simv is generated by the compilation.
(2) Execute the Verilog HDL simulation
By default, the test benchmark is executed.
The memory file of the test benchmark is test/test.mem.
$ make run
./simv
Run test/test.mem
Initializing : ..........
--------------------------------------------------
---- nqueen ----
Nqueen : N = 6
The number of solutions = 4
----------------
---- qsort ----
Sorted Seqence : 59321 A4C86 AC7D3 B210A 142044 1DEC15 1EC216 2536B2 278BCF 34A2AC
----------------
---- fib ----
Fibonacci Seqence : 1: 1 2: 1 3: 2 4: 3 5: 5 6: 8 7: D 8: 15 9: 22 A: 37
----------------
---- acker ----
acker(0,0) = 1 acker(0,1) = 2 acker(0,2) = 3 acker(1,0) = 2 acker(1,1) = 3 acker(1,2) = 4 acker(2,0) = 3 acker(2,1) = 5 acker(2,2) = 7
----------------
== elapsed clock cycles : 35030
== valid instructions executed : 28934
== IPC : 0.825
== the num of load-use stall : 1897
== branch prediction hit : 3615
== branch prediction miss : 1430
== branch prediction total : 5045
== branch prediction hit rate : 0.716
== estimated clock cycles : 35121
== r_cnt : 000088d6
== r_rout : 000000a0
You will see the above output. After the simulation finishes, statistics such as the IPC (Instructions Per Cycle) and the branch prediction hit rate are printed to the console.
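The summary ratios can be cross-checked against the raw counters. The following is a minimal sketch, not part of the RVCoreP distribution: the helper name parse_summary is hypothetical, and it assumes that IPC is the number of valid instructions divided by the elapsed clock cycles and that the hit rate is the number of hits divided by the total number of predictions.

```python
import re

# Summary lines copied from the test/test.mem run above.
LOG = """\
== elapsed clock cycles : 35030
== valid instructions executed : 28934
== IPC : 0.825
== the num of load-use stall : 1897
== branch prediction hit : 3615
== branch prediction miss : 1430
== branch prediction total : 5045
== branch prediction hit rate : 0.716
"""

def parse_summary(text):
    """Collect '== name : value' lines into a dict of strings (hypothetical helper)."""
    stats = {}
    for line in text.splitlines():
        m = re.match(r"==\s*(.+?)\s*:\s*(\S+)\s*$", line)
        if m:
            stats[m.group(1)] = m.group(2)
    return stats

stats = parse_summary(LOG)

# Assumed definitions: IPC = instructions / cycles, hit rate = hits / total.
ipc = int(stats["valid instructions executed"]) / int(stats["elapsed clock cycles"])
hit_rate = int(stats["branch prediction hit"]) / int(stats["branch prediction total"])

print(f"IPC      = {ipc:.4f} (log prints {stats['IPC']})")
print(f"hit rate = {hit_rate:.4f} (log prints {stats['branch prediction hit rate']})")
```

The recomputed values (about 0.826 and 0.717) agree with the log once they are cut to three decimal places, which suggests that the simulator truncates rather than rounds the printed ratios.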
(3) Execute the Dhrystone and CoreMark benchmarks in the Verilog HDL simulation
First, compile and run the Dhrystone benchmark.
The memory file is bench/dhrystone.mem.
$ make dhrystone
make BENCH="bench/dhrystone.mem"
iverilog -DSERIAL_WCNT=2 -DNO_IP -DMEMFILE=\"bench/dhrystone.mem\" -DMEM_SIZE=1024*32 -DNO_SERIAL -DPROGRESS -o simv top.v main.v uart.v debug.v proc.v
$ make run
./simv
Run bench/dhrystone.mem
Initialized.
--------------------------------------------------
............................................................................................
== elapsed clock cycles : 973054
== valid instructions executed : 909443
== IPC : 0.934
== the num of load-use stall : 18174
== branch prediction hit : 201153
== branch prediction miss : 16481
== branch prediction total : 217634
== branch prediction hit rate : 0.924
== estimated clock cycles : 977060
== r_cnt : 000ed8fe
== r_rout : 0000124c
Next, compile and run the CoreMark benchmark in the same way.
The memory file is bench/coremark.mem.
$ make coremark
make BENCH="bench/coremark.mem"
iverilog -DSERIAL_WCNT=2 -DNO_IP -DMEMFILE=\"bench/coremark.mem\" -DMEM_SIZE=1024*32 -DNO_SERIAL -DPROGRESS -o simv top.v main.v uart.v debug.v proc.v
$ make run
./simv
Run bench/coremark.mem
Initialized.
--------------------------------------------------
......................................................................................................................................................
.................................
== elapsed clock cycles : 1799505
== valid instructions executed : 1481298
== IPC : 0.823
== the num of load-use stall : 34930
== branch prediction hit : 363439
== branch prediction miss : 94534
== branch prediction total : 457973
== branch prediction hit rate : 0.793
== estimated clock cycles : 1799830
== r_cnt : 001b7551
== r_rout : 00002fe0
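The counters in the three simulation logs (test, Dhrystone, CoreMark) can be checked for internal consistency. The sketch below is illustrative only; the names RUNS and check are hypothetical. It verifies two relations that hold for the figures printed in the logs: branch prediction hits plus misses equal the total, and r_cnt read as a hexadecimal number equals the elapsed clock cycles.

```python
# Counter values copied from the three simulation logs.
RUNS = {
    "test":      dict(cycles=35030,   hit=3615,   miss=1430,  total=5045,   r_cnt="000088d6"),
    "dhrystone": dict(cycles=973054,  hit=201153, miss=16481, total=217634, r_cnt="000ed8fe"),
    "coremark":  dict(cycles=1799505, hit=363439, miss=94534, total=457973, r_cnt="001b7551"),
}

def check(run):
    """Return (hit + miss == total, r_cnt as hex == elapsed cycles) for one run."""
    counts_ok = run["hit"] + run["miss"] == run["total"]
    r_cnt_ok = int(run["r_cnt"], 16) == run["cycles"]
    return counts_ok, r_cnt_ok

for name, run in RUNS.items():
    counts_ok, r_cnt_ok = check(run)
    print(f"{name:10s} hit+miss==total: {counts_ok}  r_cnt==cycles: {r_cnt_ok}")
```

Both relations hold for all three runs, so r_cnt can be read as the cycle counter in hexadecimal.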
Implementation and execution on the Nexys 4 DDR board
Perform the following steps in the directory of the downloaded source code, in the recommended environment.
(1) Open the project file main.xpr in Xilinx Vivado
$ vivado main.xpr
(2) Perform logic synthesis, placement and routing, and bitstream generation using Xilinx Vivado
- Click “Generate Bitstream” in Vivado project manager
By default, the operating frequency of the processor is set to 170 MHz.
(3) Write the generated bitstream to the FPGA board
- Click “Open Hardware Manager” in Vivado project manager to open the hardware manager
- Click “Open target” and “Auto Connect” to recognize the FPGA board
- Click “Program device” and specify the bitstream file
- Click “Program” to write the bitstream to the FPGA board
When the bitstream data is correctly written to the FPGA board, the DONE LED lights up and “00000000” is displayed on the 8-digit 7-segment LEDs.
(4) Prepare for 5M baud serial communication
Open Tera Term, a terminal emulator that supports serial communication
- Click “File”→“New connection” to make a new connection
- Select the appropriate USB Serial Port and click “OK” (on Windows, selecting the serial port with the larger COM[XX] number seems to work better)
- Click “Setup”→“Terminal”
- Change “Receive” and “Transmit” in New-line to “LF” and click “OK”
- Click “Setup”→“Serial port”
- Change the Baud rate of Serial port to “5000000” and click “OK”
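To get a feel for what a 5,000,000 baud link means for sending a program binary, the transfer time can be estimated as follows. This is a rough sketch under an assumption not stated on this page: 8N1 framing, that is, 10 bits on the wire per byte (1 start bit, 8 data bits, 1 stop bit). The function name transfer_seconds and the 32 KiB example size are hypothetical.

```python
def transfer_seconds(num_bytes, baud=5_000_000, bits_per_byte=10):
    """Estimated time to send num_bytes over a serial link.

    bits_per_byte=10 assumes 8N1 framing (assumption, not confirmed by the source).
    """
    return num_bytes * bits_per_byte / baud

# Hypothetical example: a 32 KiB binary.
size = 32 * 1024
print(f"{size} bytes take about {transfer_seconds(size) * 1000:.1f} ms at 5M baud")
```

Under this assumption the link carries 500,000 bytes per second, so program binaries of a few tens of kilobytes are transferred in well under a second.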
(5) Send the RISC-V program binary to the FPGA board by serial communication and execute the program
- Click “File”→“Send file” on Tera Term
- Check the “Binary” checkbox in the “Option” column
- Select test/test.bin as the file name and click “Open”
- Send the test benchmark to the FPGA board via serial communication and execute the program on the implemented processor
The following execution result is output via serial communication.
---- nqueen ----
Nqueen : N = 6
The number of solutions = 4
----------------
---- qsort ----
Sorted Seqence : 59321 A4C86 AC7D3 B210A 142044 1DEC15 1EC216 2536B2 278BCF 34A2AC
----------------
---- fib ----
Fibonacci Seqence : 1: 1 2: 1 3: 2 4: 3 5: 5 6: 8 7: D 8: 15 9: 22 A: 37
----------------
---- acker ----
acker(0,0) = 1 acker(0,1) = 2 acker(0,2) = 3 acker(1,0) = 2 acker(1,1) = 3 acker(1,2) = 4 acker(2,0) = 3 acker(2,1) = 5 acker(2,2) = 7
----------------
The 7-segment LEDs show the value of the processor's program counter at the end of execution.
If you execute the test benchmark test/test.bin, the 7-segment LEDs show 000000A0.
While the button “BTNU” is pressed, the 7-segment LEDs show the number of execution cycles.
If you execute the test benchmark test/test.bin, the 7-segment LEDs show 000088D6.
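The 7-segment values are hexadecimal, so they can be converted to decimal and compared with the simulation log. The sketch below also estimates the wall-clock run time, assuming that the cycle counter advances at the default 170 MHz processor clock (an assumption; the page does not state which clock drives the counter). The function names are hypothetical.

```python
CLOCK_HZ = 170_000_000  # default operating frequency given on this page

def cycles_from_display(hex_digits):
    """Convert the 8-digit hexadecimal 7-segment value to an integer."""
    return int(hex_digits, 16)

def run_time_seconds(cycles, clock_hz=CLOCK_HZ):
    """Run time, assuming the counter ticks once per processor clock cycle."""
    return cycles / clock_hz

pc = cycles_from_display("000000A0")      # program counter at the end of execution
cycles = cycles_from_display("000088D6")  # shown while BTNU is pressed

print(f"final PC : {pc}")
print(f"cycles   : {cycles}")
print(f"run time : {run_time_seconds(cycles) * 1e6:.1f} us at 170 MHz")
```

The cycle count decodes to 35030, the same number of elapsed clock cycles that the Verilog HDL simulation reports for test/test.mem.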
If you want to send the binary file of the program to the FPGA board again and execute it, proceed from step (3).
Publication
RVCoreP is described in a preprint on arXiv.
Hiromu Miyazaki, Takuto Kanamori, Md Ashraful Islam, Kenji Kise: RVCoreP : An optimized RISC-V soft processor of five-stage pipelining, arXiv:2002.03568 [cs.AR] (2020-02-10).
This paper has been submitted to the Institute of Electronics, Information and Communication Engineers (IEICE).
Contact
Kise Laboratory, Department of Computer Science, School of Computing, Tokyo Institute of Technology (Tokyo Tech)
E-mail: miyazaki (at) arch.cs.titech.ac.jp
Copyright © 2020 Kise Laboratory, Tokyo Institute of Technology