diff --git a/README.md b/README.md
index 0b66a4cbdc0013a89bf11c8b607de4e762c0fca1..a2eb8166f36b29876d54d9eafc181e070fc4a3ca 100644
--- a/README.md
+++ b/README.md
@@ -10,6 +10,7 @@ ch32v003fun contains:
   * An STM32F042 Programmer, the NHC-Link042
   * An ESP32S2 Programmer, the [esp32s2-funprog](https://github.com/cnlohr/esp32s2-cookbook/tree/master/ch32v003programmer)
   * The official WCH Link-E Programmer.
+  * An Arduino-based interface, [Ardulink](https://gitlab.com/BlueSyncLine/arduino-ch32v003-swio).
   * Supports gdbserver-style-debugging for use with Visual Studio.
   * Supports printf-over-single-wire. (At about 400kBaud)
 3. An extra copy of libgcc so you can use unusual risc-v build chains, located in the `misc/libgcc.a`.
@@ -19,6 +20,11 @@ ch32v003fun contains:
 
 In Progress:
 1. Write more demos.
+2. Improve/integrate [rv003usb](https://github.com/cnlohr/rv003usb/)
+
+## Getting Started
+
+For installation instructions, see the [Installation page on the wiki](https://github.com/cnlohr/ch32v003fun/wiki/Installation).
 
 ## Features!
 
@@ -38,15 +44,6 @@ You can just try out the debugprintf project, or call SetupDebugPrintf(); and pr
 
 Via gdbserver built into minichlink!  It works with `gdb-multiarch` as well as in Visual Studio Code 
 
-### TODO
-
-## System Prep
-
-For installation instructions, see the [wiki page here](https://github.com/cnlohr/ch32v003fun/wiki/Installation)
-
-
-You can use the pre-compiled minichlink or go to minichlink dir and `make` it.
-
 ## Building and Flashing
 
 ```
@@ -54,87 +51,14 @@ cd examples/blink
 make
 ```
 
-In Linux this will "just work"(TM) using `minichlink`.  
-In Windows, if you want to use minichlink, you will need to use Zadig to install WinUSB to the WCH-Link interface 0.  
-The generated .hex file is compatible with the official WCH flash tool.  
-
 text = code, data = constants and initialization values, bss = uninitialized values.  
 dec is the sum of the 3 and reflects the number of bytes in flash that will get taken up by the program.
 
+The generated .bin is used by minichlink and the .hex file is compatible with the official WCH flash tool.  
 
-## ESP32S2 Programming
-
-## WCH-Link (E)
-
-It enumerates as 2 interfaces.
-0. the programming interface.  I can't get anything except the propreitary interface to work.
-1. the built-in usb serial port. You can hook up UART D5=TX to RX and D6=RX to TX of the CH32V003 for printf/debugging, default speed is 115200. Both are optional, connect what you need.
-
-If you want to mess with the programming code in Windows, you will have to install WinUSB to the interface 0.  Then you can uninstall it in Device Manager under USB Devices.
-
-On linux you find the serial port with `ls -l /dev/ttyUSB* /dev/ttyACM*` and connect to it with `screen /dev/ttyACM0 115200`  
-Disconnect with `CTRL+a` `:quit`.  
-
-Adding your user to these groups will remove the need to `sudo` for access to the serial port:
-debian-based
-	`sudo usermod -a -G dialout $USER`
-arch-based
-	`sudo usermod -a -G uucp $USER`
- 
- You'll need to log out and in to see the change.
-
-
-## WCH-Link Hardware access in WSL
-To use the WCH-Link in WSL, it is required to "attach" the USB hardware on the Windows side to WSL.  This is achieved using a tool called usbipd.
-
-1. On windows side, install the following MSI https://github.com/dorssel/usbipd-win/releases
-2. Install the WSL side client:
-    * For Debian: 
-        `sudo apt-get install usbip hwdata usbutils`
-    * For Arch-based:
-        `sudo pacman -S usbip hwdata usbutils`
-    * For Ubuntu (not tested):
-```
-        sudo apt install linux-tools-5.4.0-77-generic linux-tools-virtual hwdata usbutils
-        sudo update-alternatives --install /usr/local/bin/usbip usbip `ls /usr/lib/linux-tools/*/usbip | tail -n1` 20
-```
-
-3. Plug in the WCH-Link to USB
-4. Run Powershell as admin and use the `usbipd list` command to list all connected devices
-5. Find the this device: `1a86:8010  WCH-Link (Interface 0)` and note the busid it is attached to
-6. In powershell, use the command `usbipd wsl attach --busid=<BUSID>` to attach the device at the busid from previous step
-7. You will hear the windows sound for the USB device being removed (and silently attached to WSL instead)
-8. In WSL, you will now be able to run `lsusb` and see that the SCH-Link is attached
-9. For unknown reasons, you must run make under root access in order to connect to the programmer with minichlink.  Recommend running `sudo make` when building and programming projects using WSL. This may work too (to be confirmed):
-
-### non-root access on linux
-Unlike serial interfaces, by default, the USB device is owned by root, has group set to root and everyone else may only read by default.
-The way to allow non-root users/groups to be able to access devices is via udev rules.
-
-minichlink provides a list of udev rules that allows any user in the plugdev group to be able to interact with the programmers it supports.
-
-You can install and load the required udev rules for minichlink by executing the following commands in the root of this Git repository:
-```
-sudo cp minichlink/99-minichlink.rules /etc/udev/rules.d/
-sudo udevadm control --reload-rules && sudo udevadm trigger
-```
-
-If you add support for another programmer in minichlink, you will need to add more rules here.
-
-**Note:** This readme used to recommend manually making these rules under `80-USB_WCH-Link.rules`. If you wish to use the new rules file shipped in this repo, you may want to remove the old rules file.
-
-
-## minichlink
-
-I wrote some libusb copies of some of the basic functionality from WCH-Link, so you can use the little programmer dongle they give you to program the ch32v003. 
-
-Currently, it ignores all the respone codes, except when querying the chip.  But it's rather surprising how featured I could get in about 5 hours.
-
-Anyone who wants to write a good/nice utility should probably look at the code in this folder.
-
-## VSCode + PlatformIO
+## VSCode +/- PlatformIO
 
-Note: This is genearlly used for CI on this repo.  However, note that this is **not** the path that allows for debugging on Windows.
+Note: The PlatformIO path is generally used for CI on this repo.  However, this is **not** the path that allows for debugging on Windows (for that, see the [template project](https://github.com/cnlohr/ch32v003fun/tree/master/examples/template/.vscode)).
 
 This project can also be built, uploaded and debugged with VSCode and the PlatformIO extension. Simply clone and open this project in VSCode and have the PlatformIO extension installed.
 
@@ -147,9 +71,14 @@ If the C/C++ language server clangd is unable to find `ch32v003fun.h`, the examp
 `build_all_clangd.sh` does in `build scripts` does this for all examples.
 
 ## Quick Reference
- * Needed for programming/debugging: `SWIO` is on `PD1`
- * Optional (not needed, can be configured as output if fuse set): `NRST` is on `PD7`
- * UART TX (optional) is on: `PD5`
+ * **REQUIRED** for programming/debugging: `SWIO` is on `PD1`. Do not re-use PD1 for multiple functions.
+ * **OPTIONAL** `NRST` is on `PD7`. Not needed, defaults as GPIO in some configurations.
+ * **OPTIONAL** UART `TX` is on: `PD5`. We recommend using SWIO for `printf` debugging.
+
+![ch32v003a4m6](https://raw.githubusercontent.com/Tengo10/pinout-overview/main/pinouts/CH32v003/ch32v003a4m6.svg)
+![ch32v003f4p6](https://raw.githubusercontent.com/Tengo10/pinout-overview/main/pinouts/CH32v003/ch32v003f4p6.svg)
+![ch32v003f4u6](https://raw.githubusercontent.com/Tengo10/pinout-overview/main/pinouts/CH32v003/ch32v003f4u6.svg)
+![ch32v003j4m6](https://raw.githubusercontent.com/Tengo10/pinout-overview/main/pinouts/CH32v003/ch32v003j4m6.svg)
 
 ## Support
 
diff --git a/ch32v003fun/ch32v003fun.c b/ch32v003fun/ch32v003fun.c
index 6e6e60d4cd857cfa262c115b104370d6eef845c1..d030306652a65016bbd7726a16ce31fa8927ea2e 100644
--- a/ch32v003fun/ch32v003fun.c
+++ b/ch32v003fun/ch32v003fun.c
@@ -810,11 +810,11 @@ asm volatile(
 #ifdef CPLUSPLUS
 	// Call __libc_init_array function
 "	call %0 \n\t"
-: : "i" (__libc_init_array)
+: : "i" (__libc_init_array) 
+: "a0", "a1", "a2", "a3", "a4", "a5", "t0", "t1", "t2", "memory"
 #else
-: :
+: : : "a0", "a1", "a2", "a3", "memory"
 #endif
-: "a0", "a1", "a2", "a3", "memory"
 );
 
 	SETUP_SYSTICK_HCLK
@@ -917,24 +917,61 @@ int putchar(int c)
 }
 #else
 
+
+void handle_debug_input( int numbytes, uint8_t * data ) __attribute__((weak));
+void handle_debug_input( int numbytes, uint8_t * data ) { }
+
+static void internal_handle_input( uint32_t * dmdata0 )
+{
+	uint32_t dmd0 = *dmdata0;
+	int bytes = (dmd0 & 0x3f) - 4;
+	if( bytes > 0 )
+	{
+		handle_debug_input( bytes, ((uint8_t*)dmdata0) + 1 );
+	}
+}
+
+
+void poll_input()
+{
+	uint32_t lastdmd = (*DMDATA0);
+ 	if( !(lastdmd & 0x80) )
+	{
+		internal_handle_input( (uint32_t*)DMDATA0 );
+		*DMDATA0 = 0x84; // Negative
+	}
+}
+
+
 //           MSB .... LSB
 // DMDATA0: char3 char2 char1 [status word]
 // where [status word] is:
 //   b7 = is a "printf" waiting?
-//   b0..b3 = # of bytes in printf (+4).  (4 or higher indicates a print of some kind)
+//   b0..b3 = # of bytes in printf (+4).  (5 or higher indicates a print of some kind)
+//     note: if b7 is 0 in reply, but b0..b3 have >=4 then we received data from host.
 
 int _write(int fd, const char *buf, int size)
 {
 	char buffer[4] = { 0 };
 	int place = 0;
+	uint32_t lastdmd;
 	uint32_t timeout = 160000; // Give up after ~40ms
+	if( size == 0 )
+	{
+		// Simply seeking input.
+		lastdmd = (*DMDATA0);
+		if( lastdmd ) internal_handle_input( (uint32_t*)DMDATA0 );
+	}
 	while( place < size )
 	{
 		int tosend = size - place;
 		if( tosend > 7 ) tosend = 7;
 
-		while( ((*DMDATA0) & 0x80) )
+		while( ( lastdmd = (*DMDATA0) ) & 0x80 )
 			if( timeout-- == 0 ) return place;
+
+		if( lastdmd ) internal_handle_input( (uint32_t*)DMDATA0 );
+
 		timeout = 160000;
 
 		int t = 3;
@@ -963,7 +1000,9 @@ int _write(int fd, const char *buf, int size)
 int putchar(int c)
 {
 	int timeout = 16000;
-	while( ((*DMDATA0) & 0x80) ) if( timeout-- == 0 ) return 0;
+	uint32_t lastdmd = 0;
+	while( (lastdmd = (*DMDATA0)) & 0x80 ) if( timeout-- == 0 ) return 0;
+	if( lastdmd ) internal_handle_input( (uint32_t*)DMDATA0 );
 	*DMDATA0 = 0x85 | ((const char)c<<8);
 	return 1;
 }
diff --git a/ch32v003fun/ch32v003fun.h b/ch32v003fun/ch32v003fun.h
index 1634bcb8e1d59b10b564fba9e94370168f4dddee..b65a5a22dd82026709ae97d6f7247cfdd6c8de94 100644
--- a/ch32v003fun/ch32v003fun.h
+++ b/ch32v003fun/ch32v003fun.h
@@ -1302,20 +1302,20 @@ typedef struct
 #define FLASH_STATR_EOP                         ((uint8_t)0x20) /* End of operation */
 
 /*******************  Bit definition for FLASH_CTLR register  *******************/
-#define FLASH_CTLR_PG                           ((uint16_t)0x0001)     /* Programming */
-#define FLASH_CTLR_PER                          ((uint16_t)0x0002)     /* Page Erase 1KByte*/
-#define FLASH_CTLR_MER                          ((uint16_t)0x0004)     /* Mass Erase */
-#define FLASH_CTLR_OPTPG                        ((uint16_t)0x0010)     /* Option Byte Programming */
-#define FLASH_CTLR_OPTER                        ((uint16_t)0x0020)     /* Option Byte Erase */
-#define FLASH_CTLR_STRT                         ((uint16_t)0x0040)     /* Start */
-#define FLASH_CTLR_LOCK                         ((uint16_t)0x0080)     /* Lock */
-#define FLASH_CTLR_OPTWRE                       ((uint16_t)0x0200)     /* Option Bytes Write Enable */
-#define FLASH_CTLR_ERRIE                        ((uint16_t)0x0400)     /* Error Interrupt Enable */
-#define FLASH_CTLR_EOPIE                        ((uint16_t)0x1000)     /* End of operation interrupt enable */
-#define FLASH_CTLR_PAGE_PG                      ((uint16_t)0x00010000) /* Page Programming 64Byte */
-#define FLASH_CTLR_PAGE_ER                      ((uint16_t)0x00020000) /* Page Erase 64Byte */
-#define FLASH_CTLR_BUF_LOAD                     ((uint16_t)0x00040000) /* Buffer Load */
-#define FLASH_CTLR_BUF_RST                      ((uint16_t)0x00080000) /* Buffer Reset */
+#define FLASH_CTLR_PG                           (0x0001)     /* Programming */
+#define FLASH_CTLR_PER                          (0x0002)     /* Page Erase 1KByte*/
+#define FLASH_CTLR_MER                          (0x0004)     /* Mass Erase */
+#define FLASH_CTLR_OPTPG                        (0x0010)     /* Option Byte Programming */
+#define FLASH_CTLR_OPTER                        (0x0020)     /* Option Byte Erase */
+#define FLASH_CTLR_STRT                         (0x0040)     /* Start */
+#define FLASH_CTLR_LOCK                         (0x0080)     /* Lock */
+#define FLASH_CTLR_OPTWRE                       (0x0200)     /* Option Bytes Write Enable */
+#define FLASH_CTLR_ERRIE                        (0x0400)     /* Error Interrupt Enable */
+#define FLASH_CTLR_EOPIE                        (0x1000)     /* End of operation interrupt enable */
+#define FLASH_CTLR_PAGE_PG                      (0x00010000) /* Page Programming 64Byte */
+#define FLASH_CTLR_PAGE_ER                      (0x00020000) /* Page Erase 64Byte */
+#define FLASH_CTLR_BUF_LOAD                     (0x00040000) /* Buffer Load */
+#define FLASH_CTLR_BUF_RST                      (0x00080000) /* Buffer Reset */
 
 /*******************  Bit definition for FLASH_ADDR register  *******************/
 #define FLASH_ADDR_FAR                          ((uint32_t)0xFFFFFFFF) /* Flash Address */
@@ -5080,8 +5080,76 @@ void WaitForDebuggerToAttach();
 // Just a definition to the internal _write function.
 int _write(int fd, const char *buf, int size);
 
+// Call this repeatedly (e.g. in a busy-wait loop) to poll for input from the host.
+void poll_input();
+
+// Receiving bytes from host.  Override if you wish.
+void handle_debug_input( int numbytes, uint8_t * data );
+
 #endif
 
+// xw_ext.inc, thanks to @macyler, @jnk0le, @duk for this reverse engineering.
+
+/*
+Encoder for some of the proprietary 'XW' RISC-V instructions present on the QingKe RV32 processor.
+Examples:
+	XW_C_LBU(a3, a1, 27); // c.xw.lbu a3, 27(a1)
+	XW_C_SB(a0, s0, 13);  // c.xw.sb a0, 13(s0)
+
+	XW_C_LHU(a5, a5, 38); // c.xw.lhu a5, 38(a5)
+	XW_C_SH(a2, s1, 14);  // c.xw.sh a2, 14(s1)
+*/
+
+// Let us do some compile-time error checking.
+#define ASM_ASSERT(COND) .if (!(COND)); .err; .endif
+
+// Integer encodings of the possible compressed registers.
+#define C_s0 0
+#define C_s1 1
+#define C_a0 2
+#define C_a1 3
+#define C_a2 4
+#define C_a3 5
+#define C_a4 6
+#define C_a5 7
+
+// register to encoding
+#define REG2I(X) (C_ ## X)
+
+// XW opcodes
+#define XW_OP_LBUSP 0b1000000000000000
+#define XW_OP_STSP  0b1000000001000000
+
+#define XW_OP_LHUSP 0b1000000000100000
+#define XW_OP_SHSP  0b1000000001100000
+
+#define XW_OP_LBU   0b0010000000000000
+#define XW_OP_SB    0b1010000000000000
+
+#define XW_OP_LHU   0b0010000000000010
+#define XW_OP_SH    0b1010000000000010
+
+// The two different XW encodings supported at the moment.
+#define XW_ENCODE1(OP, R1, R2, IMM) ASM_ASSERT((IMM) >= 0 && (IMM) < 32); .2byte ((OP) | (REG2I(R1) << 2) | (REG2I(R2) << 7) | \
+	(((IMM) & 0b1) << 12) | (((IMM) & 0b110) << (5 - 1)) | (((IMM) & 0b11000) << (10 - 3)))
+
+#define XW_ENCODE2(OP, R1, R2, IMM) ASM_ASSERT((IMM) >= 0 && (IMM) < 32); .2byte ((OP) | (REG2I(R1) << 2) | (REG2I(R2) << 7) | \
+	(((IMM) & 0b11) << 5) | (((IMM) & 0b11100) << (10 - 2)))
+
+// Compressed load byte, zero-extend result
+#define XW_C_LBU(RD, RS, IMM) XW_ENCODE1(XW_OP_LBU, RD, RS, IMM)
+
+// Compressed store byte
+#define XW_C_SB(RS1, RS2, IMM) XW_ENCODE1(XW_OP_SB, RS1, RS2, IMM)
+
+// Compressed load half, zero-extend result
+#define XW_C_LHU(RD, RS, IMM) ASM_ASSERT(((IMM) & 1) == 0); XW_ENCODE2(XW_OP_LHU, RD, RS, ((IMM) >> 1))
+
+// Compressed store half
+#define XW_C_SH(RS1, RS2, IMM)  ASM_ASSERT(((IMM) & 1) == 0); XW_ENCODE2(XW_OP_SH, RS1, RS2, ((IMM) >> 1))
+
+
+
 #ifdef __cplusplus
 };
 #endif
diff --git a/ch32v003fun/xw_ext.inc b/ch32v003fun/xw_ext.inc
deleted file mode 100644
index ca210702fc69d9d31b87d9f69ddaf004c17e0b0c..0000000000000000000000000000000000000000
--- a/ch32v003fun/xw_ext.inc
+++ /dev/null
@@ -1,57 +0,0 @@
-/*
-Encoder for some of the proprietary 'XW' RISC-V instructions present on the QingKe RV32 processor.
-Examples:
-	XW_C_LBU(a3, a1, 27); // c.xw.lbu a3, 27(a1)
-	XW_C_SB(a0, s0, 13);  // c.xw.sb a0, 13(s0)
-
-	XW_C_LHU(a5, a5, 38); // c.xw.lhu a5, 38(a5)
-	XW_C_SH(a2, s1, 14);  // c.xw.sh a2, 14(s1)
-*/
-
-// Let us do some compile-time error checking.
-#define ASM_ASSERT(COND) .if (!(COND)); .err; .endif
-
-// Integer encodings of the possible compressed registers.
-#define C_s0 0
-#define C_s1 1
-#define C_a0 2
-#define C_a1 3
-#define C_a2 4
-#define C_a3 5
-#define C_a4 6
-#define C_a5 7
-
-// register to encoding
-#define REG2I(X) (C_ ## X)
-
-// XW opcodes
-#define XW_OP_LBUSP 0b1000000000000000
-#define XW_OP_STSP  0b1000000001000000
-
-#define XW_OP_LHUSP 0b1000000000100000
-#define XW_OP_SHSP  0b1000000001100000
-
-#define XW_OP_LBU   0b0010000000000000
-#define XW_OP_SB    0b1010000000000000
-
-#define XW_OP_LHU   0b0010000000000010
-#define XW_OP_SH    0b1010000000000010
-
-// The two different XW encodings supported at the moment.
-#define XW_ENCODE1(OP, R1, R2, IMM) ASM_ASSERT((IMM) >= 0 && (IMM) < 32); .2byte ((OP) | (REG2I(R1) << 2) | (REG2I(R2) << 7) | \
-	(((IMM) & 0b1) << 12) | (((IMM) & 0b110) << (5 - 1)) | (((IMM) & 0b11000) << (10 - 3)))
-
-#define XW_ENCODE2(OP, R1, R2, IMM) ASM_ASSERT((IMM) >= 0 && (IMM) < 32); .2byte ((OP) | (REG2I(R1) << 2) | (REG2I(R2) << 7) | \
-	(((IMM) & 0b11) << 5) | (((IMM) & 0b11100) << (10 - 2))
-
-// Compressed load byte, zero-extend result
-#define XW_C_LBU(RD, RS, IMM) XW_ENCODE1(XW_OP_LBU, RD, RS, IMM)
-
-// Compressed store byte
-#define XW_C_SB(RS1, RS2, IMM) XW_ENCODE1(XW_OP_SB, RS1, RS2, IMM)
-
-// Compressed load half, zero-extend result
-#define XW_C_LHU(RD, RS, IMM) ASM_ASSERT(((IMM) & 1) == 0); XW_ENCODE2(XW_OP_LHU, RD, RS, ((IMM) >> 1)))
-
-// Compressed store half
-#define XW_C_SH(RS1, RS2, IMM)  ASM_ASSERT(((IMM) & 1) == 0); XW_ENCODE2(XW_OP_SH, RS1, RS2, ((IMM) >> 1)))
diff --git a/examples/GPIO/GPIO.c b/examples/GPIO/GPIO.c
index a761850c339f8e445fc38585064b58a92c73a6ec..e788f786a9b79ec84b7d197bb038d632ab33fc4e 100644
--- a/examples/GPIO/GPIO.c
+++ b/examples/GPIO/GPIO.c
@@ -1,13 +1,14 @@
-// 2023-06-07 recallmenot
+// 2023-06-21 recallmenot
 
 #define DEMO_GPIO_blink					1
+#define DEMO_GPIO_blink_port				0
 #define DEMO_GPIO_out					0
 #define DEMO_GPIO_in_btn				0
 #define DEMO_ADC_bragraph				0
 #define DEMO_PWM_dayrider				0
 
-#if ((DEMO_GPIO_blink + DEMO_GPIO_out + DEMO_GPIO_in_btn + DEMO_ADC_bragraph + DEMO_PWM_dayrider) > 1 \
-  || (DEMO_GPIO_blink + DEMO_GPIO_out + DEMO_GPIO_in_btn + DEMO_ADC_bragraph + DEMO_PWM_dayrider) < 1)
+#if ((DEMO_GPIO_blink + DEMO_GPIO_blink_port + DEMO_GPIO_out + DEMO_GPIO_in_btn + DEMO_ADC_bragraph + DEMO_PWM_dayrider) > 1 \
+  || (DEMO_GPIO_blink + DEMO_GPIO_blink_port + DEMO_GPIO_out + DEMO_GPIO_in_btn + DEMO_ADC_bragraph + DEMO_PWM_dayrider) < 1)
 #error "please enable ONE of the demos by setting it to 1 and the others to 0"
 #endif
 
@@ -27,56 +28,71 @@ int main() {
 	SystemInit48HSI();
 
 #if DEMO_GPIO_blink == 1
-	GPIO_portEnable(GPIO_port_C);
-	GPIO_portEnable(GPIO_port_D);
+	GPIO_port_enable(GPIO_port_C);
+	GPIO_port_enable(GPIO_port_D);
 	// GPIO D0 Push-Pull
 	GPIO_pinMode(GPIO_port_D, 0, GPIO_pinMode_O_pushPull, GPIO_Speed_10MHz);
 	// GPIO D4 Push-Pull
-	GPIO_pinMode(GPIO_port_D, 4, GPIO_pinMode_O_pushPull, GPIO_Speed_10MHz);
+	// The P function suffix lets you specify port and pin in one parameter
+	GPIO_pinModeP(GPIO_pin_D4, GPIO_pinMode_O_pushPull, GPIO_Speed_10MHz);
 	// GPIO C0 Push-Pull
 	GPIO_pinMode(GPIO_port_C, 0, GPIO_pinMode_O_pushPull, GPIO_Speed_10MHz);
+#elif DEMO_GPIO_blink_port == 1
+	GPIO_port_enable(GPIO_port_C);
+	GPIO_port_pinMode(GPIO_port_C, GPIO_pinMode_O_pushPull, GPIO_Speed_10MHz);
 #elif DEMO_GPIO_out == 1
-	GPIO_portEnable(GPIO_port_C);
-	GPIO_portEnable(GPIO_port_D);
+	GPIO_port_enable(GPIO_port_C);
+	GPIO_port_enable(GPIO_port_D);
 	// GPIO D4 Push-Pull
 	GPIO_pinMode(GPIO_port_D, 4, GPIO_pinMode_O_pushPull, GPIO_Speed_10MHz);
 	// GPIO C0 - C7 Push-Pull
+	GPIO_port_pinMode(GPIO_port_C, GPIO_pinMode_O_pushPull, GPIO_Speed_10MHz);
+	/* faster & lighter than
 	for (int i = 0; i <= 7; i++) {
 		GPIO_pinMode(GPIO_port_C, i, GPIO_pinMode_O_pushPull, GPIO_Speed_10MHz);
 	}
+	*/
 #elif DEMO_GPIO_in_btn == 1
-	GPIO_portEnable(GPIO_port_C);
-	GPIO_portEnable(GPIO_port_D);
+	GPIO_port_enable(GPIO_port_C);
+	GPIO_port_enable(GPIO_port_D);
 	// GPIO D4 Push-Pull
 	GPIO_pinMode(GPIO_port_D, 3, GPIO_pinMode_I_pullUp, GPIO_SPEED_IN);
 	// GPIO C0 - C7 Push-Pull
+	GPIO_port_pinMode(GPIO_port_C, GPIO_pinMode_O_pushPull, GPIO_Speed_10MHz);
+	/* faster & lighter than
 	for (int i = 0; i <= 7; i++) {
 		GPIO_pinMode(GPIO_port_C, i, GPIO_pinMode_O_pushPull, GPIO_Speed_10MHz);
 	}
+	*/
 #elif DEMO_ADC_bragraph == 1
-	GPIO_portEnable(GPIO_port_C);
-	GPIO_portEnable(GPIO_port_D);
+	GPIO_port_enable(GPIO_port_C);
+	GPIO_port_enable(GPIO_port_D);
 	// GPIO D4 Push-Pull
 	GPIO_pinMode(GPIO_port_D, 4, GPIO_pinMode_O_pushPull, GPIO_Speed_10MHz);
 	// GPIO D6 analog in
 	GPIO_pinMode(GPIO_port_D, 6, GPIO_pinMode_I_analog, GPIO_SPEED_IN);
 	// GPIO C0 - C7 Push-Pull
-	for (int i = 0; i<= 7; i++) {
+	GPIO_port_pinMode(GPIO_port_C, GPIO_pinMode_O_pushPull, GPIO_Speed_10MHz);
+	/* faster & lighter than
+	for (int i = 0; i <= 7; i++) {
 		GPIO_pinMode(GPIO_port_C, i, GPIO_pinMode_O_pushPull, GPIO_Speed_10MHz);
 	}
+	*/
 	GPIO_ADCinit();
 #elif DEMO_PWM_dayrider == 1
 	//SetupUART( UART_BRR );
-	GPIO_portEnable(GPIO_port_C);
-	GPIO_portEnable(GPIO_port_D);
+	GPIO_port_enable(GPIO_port_C);
+	GPIO_port_enable(GPIO_port_D);
 	// GPIO D4 Push-Pull
 	GPIO_pinMode(GPIO_port_D, 4, GPIO_pinMode_O_pushPull, GPIO_Speed_10MHz);
 	// GPIO D6 analog in
 	GPIO_pinMode(GPIO_port_D, 6, GPIO_pinMode_I_analog, GPIO_SPEED_IN);
 	// GPIO C0 - C7 Push-Pull
-	for (int i = 0; i<= 7; i++) {
-		GPIO_pinMode(GPIO_port_C, i, GPIO_pinMode_O_pushPullMux, GPIO_Speed_50MHz);
+	/* faster & lighter than
+	for (int i = 0; i <= 7; i++) {
+		GPIO_pinMode(GPIO_port_C, i, GPIO_pinMode_O_pushPull, GPIO_Speed_10MHz);
 	}
+	*/
 	GPIO_tim2_map(GPIO_tim2_output_set_1__C5_C2_D2_C1);
 	GPIO_tim2_init();
 	GPIO_tim2_enableCH(4);
@@ -93,13 +109,24 @@ int main() {
 	while (1) {
 #if DEMO_GPIO_blink == 1
 		GPIO_digitalWrite(GPIO_port_D, 0, high);
-		GPIO_digitalWrite(GPIO_port_D, 4, high);
+		// The P function suffix lets you specify port and pin in one parameter
+		GPIO_digitalWriteP(GPIO_pin_D4, high);
 		GPIO_digitalWrite(GPIO_port_C, 0, high);
 		Delay_Ms( 250 );
 		GPIO_digitalWrite(GPIO_port_D, 0, low);
-		GPIO_digitalWrite(GPIO_port_D, 4, low);
+		// The P function suffix lets you specify port and pin in one parameter
+		GPIO_digitalWriteP(GPIO_pin_D4, low);
 		GPIO_digitalWrite(GPIO_port_C, 0, low);
 		Delay_Ms( 250 );
+#elif DEMO_GPIO_blink_port == 1
+		GPIO_port_digitalWrite(GPIO_port_C, 0b11111111);
+		Delay_Ms( 250 );
+		GPIO_port_digitalWrite(GPIO_port_C, 0b10101010);
+		Delay_Ms( 250 );
+		GPIO_port_digitalWrite(GPIO_port_C, 0b00000000);
+		Delay_Ms( 250 );
+		GPIO_port_digitalWrite(GPIO_port_C, 0b01010101);
+		Delay_Ms( 250 );
 #elif DEMO_GPIO_out == 1
 		GPIO_digitalWrite(GPIO_port_D, 4, low);
 		Delay_Ms(1000);
diff --git a/examples/debugprintfdemo/.vscode/settings.json b/examples/debugprintfdemo/.vscode/settings.json
index 5f7790e17f25ffb056231c9e95e5ecb65837a14c..e81e011860891e8f3d4696c8aaa7a31f4fd8171f 100644
--- a/examples/debugprintfdemo/.vscode/settings.json
+++ b/examples/debugprintfdemo/.vscode/settings.json
@@ -3,7 +3,7 @@
     "makefile.launchConfigurations": [
         {
             "cwd": "",
-            "sbinaryPath": "blink.elf",
+            "sbinaryPath": "debugprintfdemo.elf",
             "binaryArgs": []
         }
     ],
diff --git a/examples/debugprintfdemo/debugprintfdemo.c b/examples/debugprintfdemo/debugprintfdemo.c
index 41a153b03bc1fc0ec762eaf3b2be12f3b3d8c95c..df4195147bb4bbb0bd8a273741e432c5425fa805 100644
--- a/examples/debugprintfdemo/debugprintfdemo.c
+++ b/examples/debugprintfdemo/debugprintfdemo.c
@@ -1,11 +1,19 @@
 /* Small example showing how to use the SWIO programming pin to 
    do printf through the debug interface */
 
+#define SYSTEM_CORE_CLOCK 48000000
 #include "ch32v003fun.h"
 #include <stdio.h>
 
 uint32_t count;
 
+int last = 0;
+void handle_debug_input( int numbytes, uint8_t * data )
+{
+	last = data[0];
+	count += numbytes;
+}
+
 int main()
 {
 	SystemInit48HSI();
@@ -31,9 +39,14 @@ int main()
 		GPIOD->BSHR = 1 | (1<<4);	 // Turn on GPIOs
 		GPIOC->BSHR = 1;
 		printf( "+%lu\n", count++ );
+		Delay_Ms(100);
+		int i;
+		for( i = 0; i < 10000; i++ )
+			poll_input();
 		GPIOD->BSHR = (1<<16) | (1<<(16+4)); // Turn off GPIODs
 		GPIOC->BSHR = (1<<16);
-		printf( "-%lu\n", count++ );
+		printf( "-%lu[%c]\n", count++, last );
+		Delay_Ms(100);
 	}
 }
 
diff --git a/examples/flashtest/Makefile b/examples/flashtest/Makefile
new file mode 100644
index 0000000000000000000000000000000000000000..89a4c4c409495584772ed6a271dc715d6c6b2143
--- /dev/null
+++ b/examples/flashtest/Makefile
@@ -0,0 +1,9 @@
+all : flash
+
+TARGET:=flashtest
+
+include ../../ch32v003fun/ch32v003fun.mk
+
+flash : cv_flash
+clean : cv_clean
+
diff --git a/examples/flashtest/flashtest.c b/examples/flashtest/flashtest.c
new file mode 100644
index 0000000000000000000000000000000000000000..435dcc1861cfedab59d74c900fed40efcf8e0811
--- /dev/null
+++ b/examples/flashtest/flashtest.c
@@ -0,0 +1,103 @@
+// DOES NOT WORK HALP!!!!!!!!!!!!!!
+
+#define SYSTEM_CORE_CLOCK 48000000
+#define SYSTICK_USE_HCLK
+
+#include "ch32v003fun.h"
+#include <stdio.h>
+
+
+int main()
+{
+	int start;
+	int stop;
+
+	SETUP_SYSTICK_HCLK
+
+	SystemInit48HSI();
+	SetupDebugPrintf();
+
+	Delay_Ms(100);
+
+	printf( "Starting\n" );
+
+	// Unlock flash - be aware you need extra stuff for the bootloader.
+	FLASH->KEYR = 0x45670123;
+	FLASH->KEYR = 0xCDEF89AB;
+
+	// For option bytes.
+//	FLASH->OBKEYR = 0x45670123;
+//	FLASH->OBKEYR = 0xCDEF89AB;
+
+	FLASH->MODEKEYR = 0x45670123;
+	FLASH->MODEKEYR = 0xCDEF89AB;
+
+	printf( "FLASH->CTLR = %08lx\n", FLASH->CTLR );
+	if( FLASH->CTLR & 0x8080 ) 
+	{
+		printf( "Flash still locked\n" );
+		while(1);
+	}
+
+	uint32_t * ptr = (uint32_t*)0x08003700;
+	printf( "Memory at: %p: %08lx %08lx\n", ptr, ptr[0], ptr[1] );
+
+
+	printf( "FLASH->CTLR = %08lx\n", FLASH->CTLR );
+
+	//Erase Page
+	FLASH->CTLR = CR_PAGE_ER;
+	FLASH->ADDR = (intptr_t)ptr;
+	FLASH->CTLR = CR_STRT_Set | CR_PAGE_ER;
+	start = SysTick->CNT;
+	while( FLASH->STATR & FLASH_STATR_BSY );  // Takes about 3ms.
+	stop = SysTick->CNT;
+
+	printf( "FLASH->STATR = %08lx -> %d cycles for page erase\n", FLASH->STATR, stop - start );
+	printf( "Erase complete\n" );
+
+
+	printf( "Memory at %p: %08lx %08lx\n", ptr, ptr[0], ptr[1] );
+
+	// Clear buffer and prep for flashing.
+	FLASH->CTLR = CR_PAGE_PG;  // synonym of FTPG.
+	FLASH->CTLR = CR_BUF_RST | CR_PAGE_PG;
+	FLASH->ADDR = (intptr_t)ptr;  // This can actually happen about anywhere toward the end here.
+
+
+	// Note: It takes about 6 clock cycles for this to finish.
+	start = SysTick->CNT;
+	while( FLASH->STATR & FLASH_STATR_BSY );  // No real need for this.
+	stop = SysTick->CNT;
+	printf( "FLASH->STATR = %08lx -> %d cycles for buffer reset\n", FLASH->STATR, stop - start );
+
+
+	int i;
+	start = SysTick->CNT;
+	for( i = 0; i < 16; i++ )
+	{
+		ptr[i] = 0xabcd1234 + i; //Write to the memory
+		FLASH->CTLR = CR_PAGE_PG | FLASH_CTLR_BUF_LOAD; // Load the buffer.
+		while( FLASH->STATR & FLASH_STATR_BSY );  // Only needed if running from RAM.
+	}
+	stop = SysTick->CNT;
+	printf( "Write: %d cycles for writing data in\n", stop - start );
+
+	// Actually write the flash out. (Takes about 3ms)
+	FLASH->CTLR = CR_PAGE_PG|CR_STRT_Set;
+
+	start = SysTick->CNT;
+	while( FLASH->STATR & FLASH_STATR_BSY );
+	stop = SysTick->CNT;
+	printf( "FLASH->STATR = %08lx -> %d cycles for page write\n", FLASH->STATR, stop - start );
+
+	printf( "FLASH->STATR = %08lx\n", FLASH->STATR );
+
+	printf( "Memory at: %08lx: %08lx %08lx\n", (uint32_t)ptr, ptr[0], ptr[1] );
+
+	for( i = 0; i < 16; i++ )
+		printf( "%08lx ", ptr[i] );
+	printf( "\n" );
+	while(1);
+}
+
diff --git a/examples/sandbox/sandbox.c b/examples/sandbox/sandbox.c
deleted file mode 100644
index 33b76fdefb866b36fdd030c949f2710a7ad3fee0..0000000000000000000000000000000000000000
--- a/examples/sandbox/sandbox.c
+++ /dev/null
@@ -1,52 +0,0 @@
-/* Small example showing how to use the SWIO programming pin to 
-   do printf through the debug interface */
-
-#include "ch32v003fun.h"
-#include <stdio.h>
-
-uint32_t count;
-
-
-
-// Tell the compiler to put this code in the .data section.  That
-// will cause the startup code to copy it from flash into RAM where
-// it can be easily modified at runtime.
-void SRAMCode( ) __attribute__(( section(".data"))) __attribute__((noinline)) __attribute__((noreturn));
-void SRAMCode( )
-{
-	asm volatile( 
-"li a0, 0x40011410\n"
-"li a1, (1 | (1<<4))\n"
-"li a2, (1 | (1<<4))<<16\n"
-"1: c.sw a1, 0(a0)\n"
-"   c.sw a2, 0(a0)\n"
-"   j 1b\n" );
-    __builtin_unreachable();
-}
-
-int main()
-{
-	SystemInit48HSI();
-	SetupDebugPrintf();
-
-	// Boost CPU supply.
-	EXTEN->EXTEN_CTR = EXTEN_LDO_TRIM;
-
-	// Enable GPIOs
-	RCC->APB2PCENR |= RCC_APB2Periph_GPIOD | RCC_APB2Periph_GPIOC;
-
-	// GPIO D0 Push-Pull
-	GPIOD->CFGLR &= ~(0xf<<(4*0));
-	GPIOD->CFGLR |= (GPIO_Speed_10MHz | GPIO_CNF_OUT_PP)<<(4*0);
-
-	// GPIO D4 Push-Pull
-	GPIOD->CFGLR &= ~(0xf<<(4*4));
-	GPIOD->CFGLR |= (GPIO_Speed_10MHz | GPIO_CNF_OUT_PP)<<(4*4);
-
-	// GPIO C0 Push-Pull
-	GPIOC->CFGLR &= ~(0xf<<(4*0));
-	GPIOC->CFGLR |= (GPIO_Speed_10MHz | GPIO_CNF_OUT_PP)<<(4*0);
-
-	SRAMCode();
-}
-
diff --git a/examples/template/.vscode/c_cpp_properties.json b/examples/template/.vscode/c_cpp_properties.json
new file mode 100644
index 0000000000000000000000000000000000000000..b52c5f523093a4d5db08d9b49089335ef1749a04
--- /dev/null
+++ b/examples/template/.vscode/c_cpp_properties.json
@@ -0,0 +1,20 @@
+{
+    "configurations": [
+        {
+            "name": "Linux",
+            "includePath": [
+                "${workspaceFolder}/**",
+                "${workspaceFolder}/../../ch32v003fun"
+            ],
+            "defines": [],
+            "compilerPath": "/usr/bin/clang",
+            "cppStandard": "c++14",
+            "intelliSenseMode": "linux-clang-x64",
+            "compilerArgs": [
+                "-DCH32V003FUN_BASE"
+            ],
+            "configurationProvider": "ms-vscode.makefile-tools"
+        }
+    ],
+    "version": 4
+}
\ No newline at end of file
diff --git a/examples/template/.vscode/launch.json b/examples/template/.vscode/launch.json
new file mode 100644
index 0000000000000000000000000000000000000000..b8cdedfe59b331ebac01e49813c7edf6120c7056
--- /dev/null
+++ b/examples/template/.vscode/launch.json
@@ -0,0 +1,39 @@
+{
+	"configurations": [
+		{
+			"name": "GDB Debug Target",
+			"type": "cppdbg",
+			"request": "launch",
+			"program": "template.elf",
+			"args": [],
+			"stopAtEntry": true,
+			"cwd": "${workspaceFolder}",
+			"environment": [],
+			"externalConsole": false,
+			"MIMode": "gdb",
+			"deploySteps": [
+				{
+					"type": "shell",
+					"continueOn": "GDBServer",
+					"command": "make --directory=${workspaceFolder} closechlink flash gdbserver"
+				},
+			],
+			"setupCommands": [
+				{
+					"description": "Enable pretty-printing for gdb",
+					"text": "-enable-pretty-printing",
+					"ignoreFailures": true
+				}
+			],
+			"miDebuggerPath": "gdb-multiarch",
+			"miDebuggerServerAddress": "127.0.0.1:2000"
+		},
+		{
+			"name": "Run Only (In Terminal)",
+			"type": "node",
+			"request": "launch",
+			"program": "",
+			"preLaunchTask": "run_flash_and_gdbserver",
+		}	
+		]
+}
diff --git a/examples/template/.vscode/settings.json b/examples/template/.vscode/settings.json
new file mode 100644
index 0000000000000000000000000000000000000000..5f7790e17f25ffb056231c9e95e5ecb65837a14c
--- /dev/null
+++ b/examples/template/.vscode/settings.json
@@ -0,0 +1,15 @@
+{
+    "cmake.configureOnOpen": false,
+    "makefile.launchConfigurations": [
+        {
+            "cwd": "",
+            "binaryPath": "template.elf",
+            "binaryArgs": []
+        }
+    ],
+    "editor.insertSpaces": false,
+    "editor.tabSize": 4,
+    "files.associations": {
+        "ch32v003fun.h": "c"
+    }
+}
diff --git a/examples/template/.vscode/tasks.json b/examples/template/.vscode/tasks.json
new file mode 100644
index 0000000000000000000000000000000000000000..d086ce2071367d63333095ab57bba8615df8f184
--- /dev/null
+++ b/examples/template/.vscode/tasks.json
@@ -0,0 +1,56 @@
+{
+	"version": "2.0.0",
+	"tasks": [
+		{
+			"type": "shell",
+			"label": "flash",
+			"presentation": {
+				"echo": true,
+				"focus": false,
+				"group": "build",
+				"panel": "shared",
+				"showReuseMessage" : false
+			},
+			"command": "make closechlink flash",
+		},
+		{
+			"type": "shell",
+			"label": "run_flash_and_gdbserver",
+			"command": "make closechlink flash gdbserver",
+
+			"presentation": {
+				"echo": true,
+				"focus": false,
+				"group": "build",
+				"panel": "shared",
+				"close": true,
+				"showReuseMessage" : false
+			},
+
+			"isBackground": true,
+			"options": {
+				"cwd": "${workspaceFolder}",
+			},
+			"runOptions": {
+				"instanceLimit": 2,
+			},			 
+			"group": "build",
+			"problemMatcher": {
+				"pattern": [
+					{
+						"regexp": ".",
+						"file": 1,
+						"location": 2,
+						"message": 3
+					}
+				],
+
+				"background": {
+					"activeOnStart": false,
+					"beginsPattern": "^.*Image written.*",
+					"endsPattern": "^.*GDBServer*"
+				}
+			},
+		}
+	]
+}
diff --git a/examples/sandbox/Makefile b/examples/template/Makefile
similarity index 84%
rename from examples/sandbox/Makefile
rename to examples/template/Makefile
index b3f8b71aad1aab9cde12d9a8a36662cac06fc31c..64198f4a1131d865edcb0615105f2c0cccd66005 100644
--- a/examples/sandbox/Makefile
+++ b/examples/template/Makefile
@@ -1,6 +1,6 @@
 all : flash
 
-TARGET:=sandbox
+TARGET:=template
 
 include ../../ch32v003fun/ch32v003fun.mk
 
diff --git a/examples/template/template.c b/examples/template/template.c
new file mode 100644
index 0000000000000000000000000000000000000000..d81e4eb93d7f614b29f42e96753ef74f604a2c13
--- /dev/null
+++ b/examples/template/template.c
@@ -0,0 +1,41 @@
+/* Template app on which you can build your own. */
+#define SYSTEM_CORE_CLOCK 48000000
+
+#include "ch32v003fun.h"
+#include <stdio.h>
+
+uint32_t count;
+
+int main()
+{
+	SystemInit48HSI();
+	SetupDebugPrintf();
+
+	// Enable GPIOs
+	RCC->APB2PCENR |= RCC_APB2Periph_GPIOD | RCC_APB2Periph_GPIOC;
+
+	// GPIO D0 Push-Pull
+	GPIOD->CFGLR &= ~(0xf<<(4*0));
+	GPIOD->CFGLR |= (GPIO_Speed_10MHz | GPIO_CNF_OUT_PP)<<(4*0);
+
+	// GPIO D4 Push-Pull
+	GPIOD->CFGLR &= ~(0xf<<(4*4));
+	GPIOD->CFGLR |= (GPIO_Speed_10MHz | GPIO_CNF_OUT_PP)<<(4*4);
+
+	// GPIO C0 Push-Pull
+	GPIOC->CFGLR &= ~(0xf<<(4*0));
+	GPIOC->CFGLR |= (GPIO_Speed_10MHz | GPIO_CNF_OUT_PP)<<(4*0);
+
+	while(1)
+	{
+		GPIOD->BSHR = 1 | (1<<4);	 // Turn on GPIOs
+		GPIOC->BSHR = 1;
+		printf( "+%lu\n", count++ );
+		Delay_Ms(250);
+		GPIOD->BSHR = (1<<16) | (1<<(16+4)); // Turn off GPIOs
+		GPIOC->BSHR = (1<<16);
+		printf( "-%lu\n", count++ );
+		Delay_Ms(250);
+	}
+}
+
diff --git a/extralibs/ch32v003_GPIO_branchless.h b/extralibs/ch32v003_GPIO_branchless.h
index 728ee001044175d82084a7c25c478f5723630345..4dc4264b84ff373c844baab8b40833dc4333f5fd 100644
--- a/extralibs/ch32v003_GPIO_branchless.h
+++ b/extralibs/ch32v003_GPIO_branchless.h
@@ -32,6 +32,16 @@ digitalWrite_lo
 digitalWrite_hi
 digitalRead
 
+additionally, there are functions to operate on an entire port at once
+this can be useful where setting all pins one by one would be too inefficient / unnecessary
+an example: https://www.youtube.com/watch?v=cy6o8TrDUFU
+GPIO_port_digitalWrite
+GPIO_port_digitalRead
+
+function variants with the `P` suffix take a GPIO_pin_Pn instead of a combination of GPIO_port_P and pin number n
+example:
+`GPIO_port_D, 4` becomes `GPIO_pin_D4` when using the function with the `P` suffix
+
 
 
 analog-to-digital usage is almost Arduino-like:
@@ -96,6 +106,28 @@ enum GPIO_port_n {
 	GPIO_port_D = 0b11,
 };
 
+// pin synonyms, use is not mandatory, you can either use
+// 	these with the *P functions or
+// 	specify "GPIO_port_n, N" with the regular functions
+#define GPIO_pin_A1	GPIO_port_A, 1
+#define GPIO_pin_A2	GPIO_port_A, 2
+#define GPIO_pin_C0	GPIO_port_C, 0
+#define GPIO_pin_C1	GPIO_port_C, 1
+#define GPIO_pin_C2	GPIO_port_C, 2
+#define GPIO_pin_C3	GPIO_port_C, 3
+#define GPIO_pin_C4	GPIO_port_C, 4
+#define GPIO_pin_C5	GPIO_port_C, 5
+#define GPIO_pin_C6	GPIO_port_C, 6
+#define GPIO_pin_C7	GPIO_port_C, 7
+#define GPIO_pin_D0	GPIO_port_D, 0
+#define GPIO_pin_D1	GPIO_port_D, 1
+#define GPIO_pin_D2	GPIO_port_D, 2
+#define GPIO_pin_D3	GPIO_port_D, 3
+#define GPIO_pin_D4	GPIO_port_D, 4
+#define GPIO_pin_D5	GPIO_port_D, 5
+#define GPIO_pin_D6	GPIO_port_D, 6
+#define GPIO_pin_D7	GPIO_port_D, 7
+
 enum GPIO_pinModes {
 	GPIO_pinMode_I_floating,
 	GPIO_pinMode_I_pullUp,
@@ -158,15 +190,23 @@ enum GPIO_tim2_output_sets {
 // most functions have been reduced to function-like macros, actual definitions downstairs
 
 // setup
-#define GPIO_portEnable(GPIO_port_n)
+#define GPIO_port_enable(GPIO_port_n)
 #define GPIO_pinMode(GPIO_port_n, pin, pinMode, GPIO_Speed)
+#define GPIO_pinModeP(GPIO_pin_Pn, pinMode, GPIO_Speed)
 
 // digital
 #define GPIO_digitalWrite_hi(GPIO_port_n, pin)
+#define GPIO_digitalWrite_hiP(GPIO_pin_Pn)
 #define GPIO_digitalWrite_lo(GPIO_port_n, pin)
+#define GPIO_digitalWrite_loP(GPIO_pin_Pn)
 #define GPIO_digitalWrite(GPIO_port_n, pin, lowhigh)
+#define GPIO_digitalWriteP(GPIO_pin_Pn, lowhigh)
 #define GPIO_digitalWrite_branching(GPIO_port_n, pin, lowhigh)
+#define GPIO_digitalWrite_branchingP(GPIO_pin_Pn, lowhigh)
 #define GPIO_digitalRead(GPIO_port_n, pin)
+#define GPIO_digitalReadP(GPIO_pin_Pn)
+#define GPIO_port_digitalWrite(GPIO_port_n, byte)
+#define GPIO_port_digitalRead(GPIO_port_n)
 
 // analog to digital
 static inline void GPIO_ADCinit();
@@ -241,6 +281,17 @@ static inline void GPIO_tim2_init();
 #define GPIO_pinMode_set_PUPD_GPIO_pinMode_O_pushPullMux(GPIO_port_n, pin)
 #define GPIO_pinMode_set_PUPD_GPIO_pinMode_O_openDrainMux(GPIO_port_n, pin)
 
+#define GPIO_port_pinMode_set_PUPD2(GPIO_pinMode, GPIO_port_n)			GPIO_port_pinMode_set_PUPD_##GPIO_pinMode(GPIO_port_n)
+#define GPIO_port_pinMode_set_PUPD(GPIO_pinMode, GPIO_port_n)			GPIO_port_pinMode_set_PUPD2(GPIO_pinMode, GPIO_port_n)
+#define GPIO_port_pinMode_set_PUPD_GPIO_pinMode_I_floating(GPIO_port_n)
+#define GPIO_port_pinMode_set_PUPD_GPIO_pinMode_I_pullUp(GPIO_port_n)		GPIO_port_n_to_GPIOx(GPIO_port_n)->OUTDR = 0b11111111
+#define GPIO_port_pinMode_set_PUPD_GPIO_pinMode_I_pullDown(GPIO_port_n)		GPIO_port_n_to_GPIOx(GPIO_port_n)->OUTDR = 0b00000000
+#define GPIO_port_pinMode_set_PUPD_GPIO_pinMode_I_analog(GPIO_port_n)
+#define GPIO_port_pinMode_set_PUPD_GPIO_pinMode_O_pushPull(GPIO_port_n)
+#define GPIO_port_pinMode_set_PUPD_GPIO_pinMode_O_openDrain(GPIO_port_n)
+#define GPIO_port_pinMode_set_PUPD_GPIO_pinMode_O_pushPullMux(GPIO_port_n)
+#define GPIO_port_pinMode_set_PUPD_GPIO_pinMode_O_openDrainMux(GPIO_port_n)
+
 #if !defined(GPIO_ADC_MUX_DELAY)
 #define GPIO_ADC_MUX_DELAY 200
 #endif
@@ -272,8 +323,26 @@ static inline void GPIO_tim2_init();
 //######## small function definitions, static inline
 
 
-#undef GPIO_portEnable
-#define GPIO_portEnable(GPIO_port_n) RCC->APB2PCENR |= GPIO_port_n_to_RCC_APB2Periph(GPIO_port_n);
+#undef GPIO_port_enable
+#define GPIO_port_enable(GPIO_port_n) RCC->APB2PCENR |= GPIO_port_n_to_RCC_APB2Periph(GPIO_port_n);
+
+#define GPIO_port_pinMode(GPIO_port_n, pinMode, GPIO_Speed) ({								\
+	GPIO_port_n_to_GPIOx(GPIO_port_n)->CFGLR =	(GPIO_pinMode_to_CFG(pinMode, GPIO_Speed) << (4 * 0)) | 	\
+							(GPIO_pinMode_to_CFG(pinMode, GPIO_Speed) << (4 * 1)) | 	\
+							(GPIO_pinMode_to_CFG(pinMode, GPIO_Speed) << (4 * 2)) | 	\
+							(GPIO_pinMode_to_CFG(pinMode, GPIO_Speed) << (4 * 3)) | 	\
+							(GPIO_pinMode_to_CFG(pinMode, GPIO_Speed) << (4 * 4)) | 	\
+							(GPIO_pinMode_to_CFG(pinMode, GPIO_Speed) << (4 * 5)) | 	\
+							(GPIO_pinMode_to_CFG(pinMode, GPIO_Speed) << (4 * 6)) |		\
+							(GPIO_pinMode_to_CFG(pinMode, GPIO_Speed) << (4 * 7));		\
+	GPIO_port_pinMode_set_PUPD(pinMode, GPIO_port_n);								\
+})
+
+#undef GPIO_port_digitalWrite
+#define GPIO_port_digitalWrite(GPIO_port_n, byte)	GPIO_port_n_to_GPIOx(GPIO_port_n)->OUTDR = byte
+
+#undef GPIO_port_digitalRead
+#define GPIO_port_digitalRead(GPIO_port_n)		(GPIO_port_n_to_GPIOx(GPIO_port_n)->INDR & 0b11111111)
 
 #undef GPIO_pinMode
 #define GPIO_pinMode(GPIO_port_n, pin, pinMode, GPIO_Speed) ({							\
@@ -281,26 +350,36 @@ static inline void GPIO_tim2_init();
 	GPIO_port_n_to_GPIOx(GPIO_port_n)->CFGLR |= (GPIO_pinMode_to_CFG(pinMode, GPIO_Speed) << (4 * pin));	\
 	GPIO_pinMode_set_PUPD(pinMode, GPIO_port_n, pin);							\
 })
+#undef GPIO_pinModeP
+#define GPIO_pinModeP(GPIO_pin_Pn, pinMode, GPIO_Speed)			GPIO_pinMode(GPIO_pin_Pn, pinMode, GPIO_Speed)
 
 #undef GPIO_digitalWrite_hi
-#define GPIO_digitalWrite_hi(GPIO_port_n, pin)		GPIO_port_n_to_GPIOx(GPIO_port_n)->BSHR = (1 << pin)
+#define GPIO_digitalWrite_hi(GPIO_port_n, pin)				GPIO_port_n_to_GPIOx(GPIO_port_n)->BSHR = (1 << pin)
+#undef GPIO_digitalWrite_hiP
+#define GPIO_digitalWrite_hiP(GPIO_pin_Pn)				GPIO_digitalWrite_hi(GPIO_pin_Pn)
 #undef GPIO_digitalWrite_lo
-#define GPIO_digitalWrite_lo(GPIO_port_n, pin)		GPIO_port_n_to_GPIOx(GPIO_port_n)->BSHR = (1 << (pin + 16))
+#define GPIO_digitalWrite_lo(GPIO_port_n, pin)				GPIO_port_n_to_GPIOx(GPIO_port_n)->BSHR = (1 << (pin + 16))
+#undef GPIO_digitalWrite_loP
+#define GPIO_digitalWrite_loP(GPIO_pin_Pn)				GPIO_digitalWrite_lo(GPIO_pin_Pn)
 
 #undef GPIO_digitalWrite
-#define GPIO_digitalWrite2(GPIO_port_n, pin, lowhigh)	GPIO_digitalWrite_##lowhigh(GPIO_port_n, pin)
-#define GPIO_digitalWrite(GPIO_port_n, pin, lowhigh)	GPIO_digitalWrite2(GPIO_port_n, pin, lowhigh)
-#define GPIO_digitalWrite_low(GPIO_port_n, pin)		GPIO_digitalWrite_lo(GPIO_port_n, pin)
-#define GPIO_digitalWrite_0(GPIO_port_n, pin)		GPIO_digitalWrite_lo(GPIO_port_n, pin)
-#define GPIO_digitalWrite_high(GPIO_port_n, pin)	GPIO_digitalWrite_hi(GPIO_port_n, pin)
-#define GPIO_digitalWrite_1(GPIO_port_n, pin)		GPIO_digitalWrite_hi(GPIO_port_n, pin)
+#define GPIO_digitalWrite(GPIO_port_n, pin, lowhigh)			GPIO_digitalWrite_##lowhigh(GPIO_port_n, pin)
+#undef GPIO_digitalWriteP
+#define GPIO_digitalWriteP(GPIO_pin_Pn, lowhigh)			GPIO_digitalWrite(GPIO_pin_Pn, lowhigh)
+#define GPIO_digitalWrite_low(GPIO_port_n, pin)				GPIO_digitalWrite_lo(GPIO_port_n, pin)
+#define GPIO_digitalWrite_0(GPIO_port_n, pin)				GPIO_digitalWrite_lo(GPIO_port_n, pin)
+#define GPIO_digitalWrite_high(GPIO_port_n, pin)			GPIO_digitalWrite_hi(GPIO_port_n, pin)
+#define GPIO_digitalWrite_1(GPIO_port_n, pin)				GPIO_digitalWrite_hi(GPIO_port_n, pin)
 
 #undef GPIO_digitalWrite_branching
 #define GPIO_digitalWrite_branching(GPIO_port_n, pin, lowhigh)		(lowhigh ? GPIO_digitalWrite_hi(GPIO_port_n, pin) : GPIO_digitalWrite_lo(GPIO_port_n, pin))
+#undef GPIO_digitalWrite_branchingP
+#define GPIO_digitalWrite_branchingP(GPIO_pin_Pn, lowhigh)		GPIO_digitalWrite_branching(GPIO_pin_Pn, lowhigh)
 
 #undef GPIO_digitalRead
-#define GPIO_digitalRead(GPIO_port_n, pin)	 	((GPIO_port_n_to_GPIOx(GPIO_port_n)->INDR >> pin) & 0b1)
-
+#define GPIO_digitalRead(GPIO_port_n, pin)	 			((GPIO_port_n_to_GPIOx(GPIO_port_n)->INDR >> pin) & 0b1)
+#undef GPIO_digitalReadP
+#define GPIO_digitalReadP(GPIO_pin_Pn)			 		GPIO_digitalRead(GPIO_pin_Pn)
 
 #undef GPIO_ADC_set_sampletime
 // 0:7 => 3/9/15/30/43/57/73/241 cycles
@@ -462,15 +541,11 @@ static inline void GPIO_tim2_init() {
 	TIM2->CCER |= (TIM_OutputState_Enable ) << (4 * (channel - 1));		\
 })
 
-#define GPIO_timer_CVR(channel)		CONCAT_INDIRECT(CH, CONCAT_INDIRECT(channel, CVR))
+#define GPIO_timer_CVR(channel)				CONCAT_INDIRECT(CH, CONCAT_INDIRECT(channel, CVR))
 
 #undef GPIO_tim1_analogWrite
-#define GPIO_tim1_analogWrite(channel, value) ({				\
-	TIM1->GPIO_timer_CVR(channel) = value;					\
-})
+#define GPIO_tim1_analogWrite(channel, value) 		TIM1->GPIO_timer_CVR(channel) = value
 #undef GPIO_tim2_analogWrite
-#define GPIO_tim2_analogWrite(channel, value) ({				\
-	TIM2->GPIO_timer_CVR(channel) = value;					\
-})
+#define GPIO_tim2_analogWrite(channel, value)		TIM2->GPIO_timer_CVR(channel) = value
 
 #endif // CH32V003_GPIO_BR_H
diff --git a/minichlink/Makefile b/minichlink/Makefile
index dc6c8cc949c0dbbe024a5b31ccadd512d70b9455..964341c11b51b66accf732be82639d887df7c0cd 100644
--- a/minichlink/Makefile
+++ b/minichlink/Makefile
@@ -1,7 +1,7 @@
 TOOLS:=minichlink minichlink.so
 
 CFLAGS:=-O0 -g3 -Wall
-C_S:=minichlink.c pgm-wch-linke.c pgm-esp32s2-ch32xx.c nhc-link042.c minichgdb.c
+C_S:=minichlink.c pgm-wch-linke.c pgm-esp32s2-ch32xx.c nhc-link042.c ardulink.c serial_dev.c pgm-b003fun.c minichgdb.c
 
 # General Note: To use with GDB, gdb-multiarch
 # gdb-multilib {file}
diff --git a/minichlink/ardulink.c b/minichlink/ardulink.c
new file mode 100644
index 0000000000000000000000000000000000000000..24819c138af68a21495730fe5d7b61675114e9b7
--- /dev/null
+++ b/minichlink/ardulink.c
@@ -0,0 +1,174 @@
+#include <stdint.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include "serial_dev.h"
+#include "minichlink.h"
+
+void * TryInit_Ardulink(const init_hints_t*);
+
+static int ArdulinkWriteReg32(void * dev, uint8_t reg_7_bit, uint32_t command);
+static int ArdulinkReadReg32(void * dev, uint8_t reg_7_bit, uint32_t * commandresp);
+static int ArdulinkFlushLLCommands(void * dev);
+static int ArdulinkDelayUS(void * dev, int microseconds);
+static int ArdulinkControl3v3(void * dev, int power_on);
+static int ArdulinkExit(void * dev);
+
+typedef struct {
+	struct ProgrammerStructBase psb;
+	serial_dev_t serial;
+} ardulink_ctx_t;
+
+int ArdulinkWriteReg32(void * dev, uint8_t reg_7_bit, uint32_t command)
+{
+	uint8_t buf[6];
+	buf[0] = 'w';
+	buf[1] = reg_7_bit;
+
+	//fprintf(stderr, "WriteReg32: 0x%02x = 0x%08x\n", reg_7_bit, command);
+
+	buf[2] = command & 0xff;
+	buf[3] = (command >> 8) & 0xff;
+	buf[4] = (command >> 16) & 0xff;
+	buf[5] = (command >> 24) & 0xff;
+
+	if (serial_dev_write(&((ardulink_ctx_t*)dev)->serial, buf, 6) == -1)
+		return -errno;
+
+	if (serial_dev_read(&((ardulink_ctx_t*)dev)->serial, buf, 1) == -1)
+		return -errno;
+
+	return buf[0] == '+' ? 0 : -71; // EPROTO
+}
+
+int ArdulinkReadReg32(void * dev, uint8_t reg_7_bit, uint32_t * commandresp)
+{
+	uint8_t buf[4];
+	buf[0] = 'r';
+	buf[1] = reg_7_bit;
+
+	if (serial_dev_write(&((ardulink_ctx_t*)dev)->serial, buf, 2) == -1)
+		return -errno;
+
+	if (serial_dev_read(&((ardulink_ctx_t*)dev)->serial, buf, 4) == -1)
+		return -errno;
+
+	*commandresp = (uint32_t)buf[0] | (uint32_t)buf[1] << 8 | \
+		(uint32_t)buf[2] << 16 | (uint32_t)buf[3] << 24;
+
+	//fprintf(stderr, "ReadReg32: 0x%02x = 0x%08x\n", reg_7_bit, *commandresp);
+
+	return 0;
+}
+
+int ArdulinkFlushLLCommands(void * dev)
+{
+	return 0;
+}
+
+int ArdulinkControl3v3(void * dev, int power_on) {
+	char c;
+
+	fprintf(stderr, "Ardulink: target power %d\n", power_on);
+
+	c = power_on ? 'p' : 'P';
+	if (serial_dev_write(&((ardulink_ctx_t*)dev)->serial, &c, 1) == -1)
+		return -errno;
+
+	if (serial_dev_read(&((ardulink_ctx_t*)dev)->serial, &c, 1) == -1)
+		return -errno;
+
+	if (c != '+')
+		return -71; // EPROTO
+
+	MCF.DelayUS(dev, 20000);
+	return 0;
+}
+
+int ArdulinkDelayUS(void * dev, int microseconds) {
+	//fprintf(stderr, "Ardulink: faking delay %d\n", microseconds);
+	//usleep(microseconds);
+	return 0;
+}
+
+int ArdulinkExit(void * dev)
+{
+	serial_dev_close(&((ardulink_ctx_t*)dev)->serial);
+	free(dev);
+	return 0;
+}
+
+int ArdulinkSetupInterface( void * dev )
+{
+	char first;
+	// Let the bootloader do its thing.
+	MCF.DelayUS(dev, 3UL*1000UL*1000UL);
+
+	if (serial_dev_read(&((ardulink_ctx_t*)dev)->serial, &first, 1) == -1) {
+		perror("read");
+		return -1;
+	}
+
+	if (first != '!') {
+		fprintf(stderr, "Ardulink: not the sync character.\n");
+		return -1;
+	}
+
+	return 0;
+}
+
+void * TryInit_Ardulink(const init_hints_t* hints)
+{
+	ardulink_ctx_t *ctx;
+
+	if (!(ctx = calloc(1, sizeof(ardulink_ctx_t)))) {
+		perror("calloc");
+		return NULL;
+	}
+
+	const char* serial_to_open = NULL;
+	// Get the serial port that shall be opened.
+	// First, if we have a directly set serial port hint, use that.
+	// Otherwise, use the environment variable MINICHLINK_SERIAL.
+	// If that also doesn't exist, fall back to the default serial.
+	if (hints && hints->serial_port != NULL) {
+		serial_to_open = hints->serial_port;
+	}
+	else if ((serial_to_open = getenv("MINICHLINK_SERIAL")) == NULL) {
+		// fallback
+		serial_to_open = DEFAULT_SERIAL_NAME;
+	}
+
+	if (serial_dev_create(&ctx->serial, serial_to_open, 115200) == -1) {
+		perror("create");
+		return NULL;
+	}
+
+	if (serial_dev_open(&ctx->serial) == -1) {
+		perror("open");
+		return NULL;
+	}
+
+	// Arduino DTR reset.
+	if (serial_dev_do_dtr_reset(&ctx->serial) == -1) {
+		perror("dtr reset");
+		return NULL;
+	}
+
+	// Flush anything that might be in the RX buffer, we need the sync char.
+	if (serial_dev_flush_rx(&ctx->serial) == -1) {
+		perror("flush rx");
+		return NULL;
+	}
+
+	fprintf(stderr, "Ardulink: synced.\n");
+
+	MCF.WriteReg32 = ArdulinkWriteReg32;
+	MCF.ReadReg32 = ArdulinkReadReg32;
+	MCF.FlushLLCommands = ArdulinkFlushLLCommands;
+	MCF.Control3v3 = ArdulinkControl3v3;
+	MCF.DelayUS = ArdulinkDelayUS;
+	MCF.Exit = ArdulinkExit;
+	MCF.SetupInterface = ArdulinkSetupInterface;
+
+	return ctx;
+}
diff --git a/minichlink/hidapi.c b/minichlink/hidapi.c
index 1056c56104437fa54021e88f0fd33c20c8751f4a..04c6dced558f81fddba17a240c6dee98aa1c4e2d 100644
--- a/minichlink/hidapi.c
+++ b/minichlink/hidapi.c
@@ -2202,6 +2202,8 @@ int main(void)
 
 #include "hidapi.h"
 
+int g_hidapiSuppress;
+
 /* Definitions from linux/hidraw.h. Since these are new, some distros
    may not have header files which contain them. */
 #ifndef HIDIOCSFEATURE
@@ -2796,14 +2798,14 @@ hid_device * HID_API_EXPORT hid_open_path(const char *path)
 
 		/* Get Report Descriptor Size */
 		res = ioctl(dev->device_handle, HIDIOCGRDESCSIZE, &desc_size);
-		if (res < 0)
+		if (res < 0 && !g_hidapiSuppress)
 			perror("HIDIOCGRDESCSIZE");
 
 
 		/* Get Report Descriptor */
 		rpt_desc.size = desc_size;
 		res = ioctl(dev->device_handle, HIDIOCGRDESC, &rpt_desc);
-		if (res < 0) {
+		if (res < 0 && !g_hidapiSuppress) {
 			perror("HIDIOCGRDESC");
 		} else {
 			/* Determine if this device uses numbered reports. */
@@ -2899,7 +2901,7 @@ int HID_API_EXPORT hid_send_feature_report(hid_device *dev, const unsigned char
 	int res;
 
 	res = ioctl(dev->device_handle, HIDIOCSFEATURE(length), data);
-	if (res < 0)
+	if (res < 0 && !g_hidapiSuppress)
 		perror("ioctl (SFEATURE)");
 
 	return res;
@@ -2910,7 +2912,7 @@ int HID_API_EXPORT hid_get_feature_report(hid_device *dev, unsigned char *data,
 	int res;
 
 	res = ioctl(dev->device_handle, HIDIOCGFEATURE(length), data);
-	if (res < 0)
+	if (res < 0 && !g_hidapiSuppress)
 		perror("ioctl (GFEATURE)");
 
 
diff --git a/minichlink/microgdbstub.h b/minichlink/microgdbstub.h
index da9246f835ff697825a706f14ffb6111982df523..39129034ff09ce4079bbceb90ae8eb05842e64c0 100644
--- a/minichlink/microgdbstub.h
+++ b/minichlink/microgdbstub.h
@@ -29,6 +29,7 @@ void RVDebugExec( void * dev, int halt_reset_or_resume );
 int RVReadMem( void * dev, uint32_t memaddy, uint8_t * payload, int len );
 int RVHandleBreakpoint( void * dev, int set, uint32_t address );
 int RVWriteRAM(void * dev, uint32_t memaddy, uint32_t length, uint8_t * payload );
+void RVCommandResetPart( void * dev );
 void RVHandleDisconnect( void * dev );
 void RVHandleGDBBreakRequest( void * dev );
 void RVHandleKillRequest( void * dev );
@@ -61,7 +62,7 @@ typedef struct pollfd { SOCKET fd; SHORT  events; SHORT  revents; };
 #define POLLIN 0x0001
 #define POLLERR 0x008
 #define POLLHUP 0x010
-int WSAAPI WSAPoll(struct pollfd * fdArray, ULONG       fds, INT         timeout );
+int WSAAPI WSAPoll(struct pollfd * fdArray, ULONG	   fds, INT		 timeout );
 #endif
 #define poll WSAPoll
 #define socklen_t uint32_t
@@ -171,19 +172,33 @@ void HandleGDBPacket( void * dev, char * data, int len )
 	{
 	case 'q':
 		if( StringMatch( data, "Attached" ) )
-		    SendReplyFull( "1" ); //Attached to an existing process.
+			SendReplyFull( "1" ); //Attached to an existing process.
 		else if( StringMatch( data, "Supported" ) )
-		    SendReplyFull( "PacketSize=f000;qXfer:memory-map:read+" );
+			SendReplyFull( "PacketSize=f000;qXfer:memory-map:read+" );
 		else if( StringMatch( data, "C") ) // Get Current Thread ID. (Can't be -1 or 0.  Those are special)
-		    SendReplyFull( "QC1" );
+			SendReplyFull( "QC1" );
 		else if( StringMatch( data, "fThreadInfo" ) )  // Query all active thread IDs (Can't be 0 or 1)
 			SendReplyFull( "m1" );
 		else if( StringMatch( data, "sThreadInfo" ) )  // Query all active thread IDs, continued
-		    SendReplyFull( "l" );
+			SendReplyFull( "l" );
+		else if( StringMatch( data, "Rcmd,7265736574" ) )  // "monitor reset"
+		{
+			RVCommandResetPart( dev ); // Force reset
+			SendReplyFull( "+" );
+		}
 		else if( StringMatch( data, "Xfer:memory-map" ) )
-		    SendReplyFull( MICROGDBSTUB_MEMORY_MAP );
+		{
+			int mslen = strlen( MICROGDBSTUB_MEMORY_MAP ) + 32;
+			char map[mslen];
+			struct InternalState * iss = (struct InternalState*)(((struct ProgrammerStructBase*)dev)->internal);
+			snprintf( map, mslen, MICROGDBSTUB_MEMORY_MAP, iss->flash_size, iss->sector_size, iss->ram_size );
+			SendReplyFull( map );
+		}
 		else
+		{
+			printf( "Unknown command: %s\n", data );
 			SendReplyFull( "" );
+		}
 		break;
 	case 'c':
 	case 'C':
@@ -487,6 +502,7 @@ static int GDBListen( void * dev )
 		serverSocket = 0;
 		return -1;
 	}
+	
 	return 0;
 }
 
@@ -494,14 +510,16 @@ int MicroGDBPollServer( void * dev )
 {
 	if( !serverSocket ) return -4;
 
-	struct pollfd allpolls[2];
-
 	int pollct = 1;
+	struct pollfd allpolls[1] = { 0 };
 	allpolls[0].fd = serverSocket;
-	allpolls[0].events = POLLIN;
-
-	//Do something to watch all currently-waiting sockets.
-	poll( allpolls, pollct, 0 );
+	allpolls[0].events = 0x00000100; //POLLRDNORM;
+	int r = poll( allpolls, pollct, 0 );
+	
+	if( r < 0 )
+	{
+		printf( "poll() failed: %d\n", r );
+	}
 
 	//If there's faults, bail.
 	if( allpolls[0].revents & (POLLERR|POLLHUP) )
@@ -608,18 +626,16 @@ int MicroGDBStubStartup( void * dev )
 {
 #if defined( WIN32 ) || defined( _WIN32 )
 {
-    WORD wVersionRequested;
-    WSADATA wsaData;
-    int err;
-    wVersionRequested = MAKEWORD(2, 2);
-
-    err = WSAStartup(wVersionRequested, &wsaData);
-    if (err != 0) {
-        /* Tell the user that we could not find a usable */
-        /* Winsock DLL.                                  */
-        fprintf( stderr, "WSAStartup failed with error: %d\n", err);
-        return 1;
-    }
+	WORD wVersionRequested;
+	WSADATA wsaData;
+	int err;
+	wVersionRequested = MAKEWORD(2, 2);
+
+	err = WSAStartup(wVersionRequested, &wsaData);
+	if (err != 0) {
+		fprintf( stderr, "WSAStartup failed with error: %d\n", err);
+		return 1;
+	}
 }
 #endif
 
diff --git a/minichlink/minichgdb.c b/minichlink/minichgdb.c
index 0596f4fdaed6bde45741cad77193ee18d70ece0e..7cce3e21221640c9bad4ae0f04f2239a8a64e23c 100644
--- a/minichlink/minichgdb.c
+++ b/minichlink/minichgdb.c
@@ -9,16 +9,18 @@
 #define MICROGDBSTUB_SOCKETS
 #define MICROGDBSTUB_PORT 2000
 
-
 const char* MICROGDBSTUB_MEMORY_MAP = "l<?xml version=\"1.0\"?>"
 "<!DOCTYPE memory-map PUBLIC \"+//IDN gnu.org//DTD GDB Memory Map V1.0//EN\" \"http://sourceware.org/gdb/gdb-memory-map.dtd\">"
 "<memory-map>"
-"  <memory type=\"flash\" start=\"0x00000000\" length=\"0x4000\">"
-"    <property name=\"blocksize\">64</property>"
+"  <memory type=\"flash\" start=\"0x00000000\" length=\"0x%x\">"
+"    <property name=\"blocksize\">%d</property>"
 "  </memory>"
-"  <memory type=\"ram\" start=\"0x20000000\" length=\"0x800\">"
+"  <memory type=\"ram\" start=\"0x20000000\" length=\"0x%x\">"
 "    <property name=\"blocksize\">1</property>"
 "  </memory>"
+"  <memory type=\"ram\" start=\"0x40000000\" length=\"0x10000000\">"
+"    <property name=\"blocksize\">4</property>"
+"  </memory>"
 "</memory-map>";
 
 #include "microgdbstub.h"
@@ -69,10 +71,16 @@ void RVCommandEpilogue( void * dev )
 	MCF.WriteReg32( dev, DMDATA0, 0 );
 }
 
+void RVCommandResetPart( void * dev )
+{
+	MCF.HaltMode( dev, HALT_MODE_HALT_AND_RESET );
+	RVCommandPrologue( dev );
+}
+
 void RVNetConnect( void * dev )
 {
 	// ??? Should we actually halt?
-	MCF.HaltMode( dev, 0 );
+	MCF.HaltMode( dev, HALT_MODE_HALT_BUT_NO_RESET );
 	MCF.SetEnableBreakpoints( dev, 1, 0 );
 	RVCommandPrologue( dev );
 	shadow_running_state = 0;
@@ -125,7 +133,7 @@ int RVReadCPURegister( void * dev, int regno, uint32_t * regret )
 {
 	if( shadow_running_state )
 	{
-		MCF.HaltMode( dev, 0 );
+		MCF.HaltMode( dev, HALT_MODE_HALT_BUT_NO_RESET );
 		RVCommandPrologue( dev );
 		shadow_running_state = 0;
 	}
@@ -349,7 +357,7 @@ int RVWriteRAM(void * dev, uint32_t memaddy, uint32_t length, uint8_t * payload
 
 void RVHandleDisconnect( void * dev )
 {
-	MCF.HaltMode( dev, 0 );
+	MCF.HaltMode( dev, HALT_MODE_HALT_BUT_NO_RESET );
 	MCF.SetEnableBreakpoints( dev, 0, 0 );
 
 	int i;
@@ -373,7 +381,7 @@ void RVHandleGDBBreakRequest( void * dev )
 {
 	if( shadow_running_state )
 	{
-		MCF.HaltMode( dev, 0 );
+		MCF.HaltMode( dev, HALT_MODE_HALT_BUT_NO_RESET );
 	}
 }
 
diff --git a/minichlink/minichlink.c b/minichlink/minichlink.c
index 8c3e5ecd3b7c5353a9713945985c4be9931cff0e..cf3f32542a3f32cbfbc33f11f635f6329b458992 100644
--- a/minichlink/minichlink.c
+++ b/minichlink/minichlink.c
@@ -8,46 +8,84 @@
 #include <stdio.h>
 #include <string.h>
 #include <stdlib.h>
+#include <getopt.h>
+#include "terminalhelp.h"
 #include "minichlink.h"
 #include "../ch32v003fun/ch32v003fun.h"
 
 #if defined(WINDOWS) || defined(WIN32) || defined(_WIN32)
+#ifndef _SYNCHAPI_H_
 void Sleep(uint32_t dwMilliseconds);
+#endif
 #else
 #include <unistd.h>
 #endif
 
-
 static int64_t StringToMemoryAddress( const char * number ) __attribute__((used));
 static void StaticUpdatePROGBUFRegs( void * dev ) __attribute__((used));
-static int InternalUnlockBootloader( void * dev ) __attribute__((used));
 int DefaultReadBinaryBlob( void * dev, uint32_t address_to_read_from, uint32_t read_size, uint8_t * blob );
 
 void TestFunction(void * v );
 struct MiniChlinkFunctions MCF;
 
-void * MiniCHLinkInitAsDLL( struct MiniChlinkFunctions ** MCFO )
+void * MiniCHLinkInitAsDLL( struct MiniChlinkFunctions ** MCFO, const init_hints_t* init_hints )
 {
 	void * dev = 0;
-	if( (dev = TryInit_WCHLinkE()) )
-	{
-		fprintf( stderr, "Found WCH Link\n" );
-	}
-	else if( (dev = TryInit_ESP32S2CHFUN()) )
+	
+	const char * specpgm = init_hints->specific_programmer;
+	if( specpgm )
 	{
-		fprintf( stderr, "Found ESP32S2 Programmer\n" );
+		if( strcmp( specpgm, "linke" ) == 0 )
+			dev = TryInit_WCHLinkE();
+		else if( strcmp( specpgm, "esp32s2chfun" ) == 0 )
+			dev = TryInit_ESP32S2CHFUN();
+		else if( strcmp( specpgm, "nchlink" ) == 0 )
+			dev = TryInit_NHCLink042();
+		else if( strcmp( specpgm, "b003boot" ) == 0 )
+			dev = TryInit_B003Fun();
+		else if( strcmp( specpgm, "ardulink" ) == 0 )
+			dev = TryInit_Ardulink( init_hints );
 	}
-	else if ((dev = TryInit_NHCLink042()))
+	else
 	{
-		fprintf( stderr, "Found NHC-Link042 Programmer\n" );
+		if( (dev = TryInit_WCHLinkE()) )
+		{
+			fprintf( stderr, "Found WCH Link\n" );
+		}
+		else if( (dev = TryInit_ESP32S2CHFUN()) )
+		{
+			fprintf( stderr, "Found ESP32S2 Programmer\n" );
+		}
+		else if ((dev = TryInit_NHCLink042()))
+		{
+			fprintf( stderr, "Found NHC-Link042 Programmer\n" );
+		}
+		else if ((dev = TryInit_B003Fun()))
+		{
+			fprintf( stderr, "Found B003Fun Bootloader\n" );
+		}
+		else if ( init_hints->serial_port && (dev = TryInit_Ardulink(init_hints)))
+		{
+			fprintf( stderr, "Found Ardulink Programmer\n" );
+		}
 	}
-	else
+
+	if( !dev )
 	{
 		fprintf( stderr, "Error: Could not initialize any supported programmers\n" );
 		return 0;
 	}
 
+	struct InternalState * iss = calloc( 1, sizeof( struct InternalState ) );
+	((struct ProgrammerStructBase*)dev)->internal = iss;
+	iss->ram_base = 0x20000000;
+	iss->ram_size = 2048;
+	iss->sector_size = 64;
+	iss->flash_size = 16384;
+	iss->target_chip_type = 0;
+
 	SetupAutomaticHighLevelFunctions( dev );
+
 	if( MCFO )
 	{
 		*MCFO = &MCF;
@@ -58,11 +96,34 @@ void * MiniCHLinkInitAsDLL( struct MiniChlinkFunctions ** MCFO )
 #if !defined( MINICHLINK_AS_LIBRARY ) && !defined( MINICHLINK_IMPORT )
 int main( int argc, char ** argv )
 {
+	int i;
+
 	if( argc > 1 && argv[1][0] == '-' && argv[1][1] == 'h' )
 	{
 		goto help;
 	}
-	void * dev = MiniCHLinkInitAsDLL( 0 );
+	init_hints_t hints;
+	memset(&hints, 0, sizeof(hints));
+
+	// Scan for possible hints.
+	for( i = 0; i < argc; i++ )
+	{
+		char * v = argv[i];
+		if( strncmp( v, "-c", 2 ) == 0 )
+		{
+			i++;
+			if( i < argc )
+				hints.serial_port = argv[i];
+		}
+		else if( strncmp( v, "-C", 2 ) == 0 )
+		{
+			i++;
+			if( i < argc )
+				hints.specific_programmer = argv[i];
+		}
+	}
+
+	void * dev = MiniCHLinkInitAsDLL( 0, &hints );
 	if( !dev )
 	{
 		fprintf( stderr, "Error: Could not initialize any supported programmers\n" );
@@ -140,6 +201,17 @@ keep_going:
 				else
 					goto unimplemented;
 				break;
+			case 'C': // For specifying programmer
+			case 'c':
+				// COM port or programmer argument already parsed previously
+				// we still need to skip the next argument
+				iarg+=1;
+				if( iarg >= argc )
+				{
+					fprintf( stderr, "-c/-C requires an argument\n" );
+					goto unimplemented;
+				}
+				break;
 			case 'u':
 				if( MCF.Unbrick )
 					MCF.Unbrick( dev );
@@ -152,55 +224,55 @@ keep_going:
 					goto unimplemented;
 				break;
 			case 'b':  //reBoot
-				if( !MCF.HaltMode || MCF.HaltMode( dev, 1 ) )
+				if( !MCF.HaltMode || MCF.HaltMode( dev, HALT_MODE_REBOOT ) )
 					goto unimplemented;
 				break;
 			case 'B':  //reBoot into Bootloader
-				if( !MCF.HaltMode || MCF.HaltMode( dev, 3 ) )
+				if( !MCF.HaltMode || MCF.HaltMode( dev, HALT_MODE_GO_TO_BOOTLOADER ) )
 					goto unimplemented;
 				break;
 			case 'e':  //rEsume
-				if( !MCF.HaltMode || MCF.HaltMode( dev, 2 ) )
+				if( !MCF.HaltMode || MCF.HaltMode( dev, HALT_MODE_RESUME ) )
 					goto unimplemented;
 				break;
 			case 'E':  //Erase whole chip.
-				if( MCF.HaltMode ) MCF.HaltMode( dev, 0 );
+				if( MCF.HaltMode ) MCF.HaltMode( dev, HALT_MODE_HALT_AND_RESET );
 				if( !MCF.Erase || MCF.Erase( dev, 0, 0, 1 ) )
 					goto unimplemented;
 				break;
 			case 'a':
-				if( !MCF.HaltMode || MCF.HaltMode( dev, 0 ) )
+				if( !MCF.HaltMode || MCF.HaltMode( dev, HALT_MODE_HALT_AND_RESET ) )
 					goto unimplemented;
 				break;
 			case 'A':  // Halt without reboot
-				if( !MCF.HaltMode || MCF.HaltMode( dev, 5 ) )
+				if( !MCF.HaltMode || MCF.HaltMode( dev, HALT_MODE_HALT_BUT_NO_RESET ) )
 					goto unimplemented;
 				break;
 
 			// disable NRST pin (turn it into a GPIO)
 			case 'd':  // see "RSTMODE" in datasheet
-				if( MCF.HaltMode ) MCF.HaltMode( dev, 0 );
+				if( MCF.HaltMode ) MCF.HaltMode( dev, HALT_MODE_HALT_AND_RESET );
 				if( MCF.ConfigureNRSTAsGPIO )
 					MCF.ConfigureNRSTAsGPIO( dev, 0 );
 				else
 					goto unimplemented;
 				break;
 			case 'D': // see "RSTMODE" in datasheet
-				if( MCF.HaltMode ) MCF.HaltMode( dev, 0 );
+				if( MCF.HaltMode ) MCF.HaltMode( dev, HALT_MODE_HALT_AND_RESET );
 				if( MCF.ConfigureNRSTAsGPIO )
 					MCF.ConfigureNRSTAsGPIO( dev, 1 );
 				else
 					goto unimplemented;
 				break;
 			case 'p': 
-				if( MCF.HaltMode ) MCF.HaltMode( dev, 0 );
+				if( MCF.HaltMode ) MCF.HaltMode( dev, HALT_MODE_HALT_AND_RESET );
 				if( MCF.ConfigureReadProtection )
 					MCF.ConfigureReadProtection( dev, 0 );
 				else
 					goto unimplemented;
 				break;
 			case 'P':
-				if( MCF.HaltMode ) MCF.HaltMode( dev, 0 );
+				if( MCF.HaltMode ) MCF.HaltMode( dev, HALT_MODE_HALT_AND_RESET );
 				if( MCF.ConfigureReadProtection )
 					MCF.ConfigureReadProtection( dev, 1 );
 				else
@@ -221,27 +293,48 @@ keep_going:
 				{
 					printf( "GDBServer Running\n" );
 				}
-				else
+				else if( argchar[1] == 'T' )
 				{
 					// In case we aren't running already.
-					//MCF.HaltMode( dev, 2 );
-					//XXX TODO: Why do some programmers start automatically, and others don't? 
+					MCF.HaltMode( dev, HALT_MODE_RESUME );
 				}
 
+				CaptureKeyboardInput();
+
+				uint32_t appendword = 0;
 				do
 				{
 					uint8_t buffer[256];
 					if( !IsGDBServerInShadowHaltState( dev ) )
 					{
-						int r = MCF.PollTerminal( dev, buffer, sizeof( buffer ), 0, 0 );
-						if( r < 0 )
+						// Handle keyboard input.
+						if( appendword == 0 )
+						{
+							int i;
+							for( i = 0; i < 3; i++ )
+							{
+								if( !IsKBHit() ) break;
+								appendword |= ReadKBByte() << (i*8+8);
+							}
+							appendword |= i+4; // Will go into DATA0.
+						}
+						int r = MCF.PollTerminal( dev, buffer, sizeof( buffer ), appendword, 0 );
+						if( r == -1 )
+						{
+							// Other end ack'd without printf.
+							appendword = 0;
+						}
+						else if( r < 0 )
 						{
 							fprintf( stderr, "Terminal dead.  code %d\n", r );
 							return -32;
 						}
-						if( r > 0 )
+						else if( r > 0 )
 						{
-							fwrite( buffer, r, 1, stdout ); 
+							fwrite( buffer, r, 1, stdout );
+							fflush( stdout );
+							// Otherwise it's basically just an ack for appendword.
+							appendword = 0;
 						}
 					}
 
@@ -343,7 +436,7 @@ keep_going:
 			}
 			case 'r':
 			{
-				if( MCF.HaltMode ) MCF.HaltMode( dev, 5 ); //No need to reboot.
+				if( MCF.HaltMode ) MCF.HaltMode( dev, HALT_MODE_HALT_BUT_NO_RESET ); //No need to reboot.
 
 				if( argchar[2] != 0 )
 				{
@@ -367,8 +460,6 @@ keep_going:
 					return -9;
 				}
 
-				// Round up amount.
-				amount = ( amount + 3 ) & 0xfffffffc;
 				FILE * f = 0;
 				int hex = 0;
 				if( strcmp( fname, "-" ) == 0 )
@@ -423,6 +514,7 @@ keep_going:
 			}
 			case 'w':
 			{
+				struct InternalState * iss = (struct InternalState*)(((struct ProgrammerStructBase*)dev)->internal);
 				if( argchar[2] != 0 ) goto help;
 				iarg++;
 				argchar = 0; // Stop advancing
@@ -503,14 +595,14 @@ keep_going:
 					fprintf( stderr, "Error: File I/O Fault.\n" );
 					exit( -10 );
 				}
-				if( len > 16384 )
+				if( len > iss->flash_size )
 				{
 					fprintf( stderr, "Error: Image for CH32V003 too large (%d)\n", len );
 					exit( -9 );
 				}
 
-				int is_flash = ( offset & 0xff000000 ) == 0x08000000 || ( offset & 0x1FFFF800 ) == 0x1FFFF000;
-				if( MCF.HaltMode ) MCF.HaltMode( dev, is_flash?0:5 );
+				int is_flash = IsAddressFlash( offset );
+				if( MCF.HaltMode ) MCF.HaltMode( dev, is_flash ? HALT_MODE_HALT_AND_RESET : HALT_MODE_HALT_BUT_NO_RESET );
 
 				if( MCF.WriteBinaryBlob )
 				{
@@ -551,7 +643,9 @@ help:
 	fprintf( stderr, " -5 Enable 5V\n" );
 	fprintf( stderr, " -t Disable 3.3V\n" );
 	fprintf( stderr, " -f Disable 5V\n" );
+	fprintf( stderr, " -c [serial port for Ardulink, try /dev/ttyACM0 or COM11 etc]\n" );
 	fprintf( stderr, " -u Clear all code flash - by power off (also can unbrick)\n" );
+	fprintf( stderr, " -E Erase chip\n" );
 	fprintf( stderr, " -b Reboot out of Halt\n" );
 	fprintf( stderr, " -e Resume from halt\n" );
 	fprintf( stderr, " -a Reboot into Halt\n" );
@@ -583,8 +677,6 @@ unimplemented:
 #define strtoll _strtoi64
 #endif
 
-static int StaticUnlockFlash( void * dev, struct InternalState * iss );
-
 int64_t SimpleReadNumberInt( const char * number, int64_t defaultNumber )
 {
 	if( !number || !number[0] ) return defaultNumber;
@@ -639,7 +731,7 @@ static int DefaultWaitForFlash( void * dev )
 		rw = 0;
 		MCF.ReadWord( dev, (intptr_t)&FLASH->STATR, &rw ); // FLASH_STATR => 0x4002200C
 		if( timeout++ > 100 ) return -1;
-	} while(rw & 1);  // BSY flag.
+	} while(rw & 3);  // BSY flag for 003, or WRBSY for other processors.
 
 	if( rw & FLASH_STATR_WRPRTERR )
 	{
@@ -654,67 +746,6 @@ static int DefaultWaitForDoneOp( void * dev, int ignore )
 {
 	int r;
 	uint32_t rrv;
-	uint32_t temp;
-
-	//Debug regdump pre-command
-	#if 0
-	MCF.ReadReg32(dev, DMDATA0, &temp);
-	fprintf(stderr, "Pr-DMDATA0: %08x\n", temp);
-
-	MCF.ReadReg32(dev, DMDATA1, &temp);
-	fprintf(stderr, "Pr-DMDATA1: %08x\n", temp);
-
-	MCF.ReadReg32(dev, DMCONTROL, &temp);
-	fprintf(stderr, "Pr-DMCONTROL: %08x\n", temp);
-	
-	MCF.ReadReg32(dev, DMSTATUS, &temp);
-	fprintf(stderr, "Pr-DMSTATUS: %08x\n", temp);
-
-	MCF.ReadReg32(dev, DMHARTINFO, &temp);
-	fprintf(stderr, "Pr-DMHARTINFO: %08x\n", temp);
-
-	MCF.ReadReg32(dev, DMABSTRACTCS, &temp);
-	fprintf(stderr, "Pr-DMABSTRACTCS: %08x\n", temp);
-
-	MCF.ReadReg32(dev, DMCOMMAND, &temp);
-	fprintf(stderr, "Pr-DMCOMMAND: %08x\n", temp);
-
-	MCF.ReadReg32(dev, DMABSTRACTAUTO, &temp);
-	fprintf(stderr, "Pr-DMABSTRACTAUTO: %08x\n", temp);
-
-	MCF.ReadReg32(dev, DMPROGBUF0, &temp);
-	fprintf(stderr, "Pr-DMPROGBUF0: %08x\n", temp);
-
-	MCF.ReadReg32(dev, DMPROGBUF1, &temp);
-	fprintf(stderr, "Pr-DMPROGBUF1: %08x\n", temp);
-
-	MCF.ReadReg32(dev, DMPROGBUF2, &temp);
-	fprintf(stderr, "Pr-DMPROGBUF2: %08x\n", temp);
-
-	MCF.ReadReg32(dev, DMPROGBUF3, &temp);
-	fprintf(stderr, "Pr-DMPROGBUF3: %08x\n", temp);
-
-	MCF.ReadReg32(dev, DMPROGBUF4, &temp);
-	fprintf(stderr, "Pr-DMPROGBUF4: %08x\n", temp);
-
-	MCF.ReadReg32(dev, DMPROGBUF5, &temp);
-	fprintf(stderr, "Pr-DMPROGBUF5: %08x\n", temp);
-
-	MCF.ReadReg32(dev, DMPROGBUF6, &temp);
-	fprintf(stderr, "Pr-DMPROGBUF6: %08x\n", temp);
-
-	MCF.ReadReg32(dev, DMPROGBUF7, &temp);
-	fprintf(stderr, "Pr-DMPROGBUF7: %08x\n", temp);
-
-	MCF.ReadReg32(dev, DMCPBR, &temp);
-	fprintf(stderr, "Pr-DMCPBR: %08x\n", temp);
-
-	MCF.ReadReg32(dev, DMCFGR, &temp);
-	fprintf(stderr, "Pr-DMCFGR: %08x\n", temp);
-
-	MCF.ReadReg32(dev, DMSHDWCFGR, &temp);
-	fprintf(stderr, "Pr-DMSHDWCFGR: %08x\n", temp);
-	#endif
 
 	do
 	{
@@ -723,66 +754,6 @@ static int DefaultWaitForDoneOp( void * dev, int ignore )
 	}
 	while( rrv & (1<<12) );
 
-	//Debug regdump post-command
-	#if 0
-	MCF.ReadReg32(dev, DMDATA0, &temp);
-	fprintf(stderr, "Po-DMDATA0: %08x\n", temp);
-
-	MCF.ReadReg32(dev, DMDATA1, &temp);
-	fprintf(stderr, "Po-DMDATA1: %08x\n", temp);
-
-	MCF.ReadReg32(dev, DMCONTROL, &temp);
-	fprintf(stderr, "Po-DMCONTROL: %08x\n", temp);
-	
-	MCF.ReadReg32(dev, DMSTATUS, &temp);
-	fprintf(stderr, "Po-DMSTATUS: %08x\n", temp);
-
-	MCF.ReadReg32(dev, DMHARTINFO, &temp);
-	fprintf(stderr, "Po-DMHARTINFO: %08x\n", temp);
-
-	MCF.ReadReg32(dev, DMABSTRACTCS, &temp);
-	fprintf(stderr, "Po-DMABSTRACTCS: %08x\n", temp);
-
-	MCF.ReadReg32(dev, DMCOMMAND, &temp);
-	fprintf(stderr, "Po-DMCOMMAND: %08x\n", temp);
-
-	MCF.ReadReg32(dev, DMABSTRACTAUTO, &temp);
-	fprintf(stderr, "Po-DMABSTRACTAUTO: %08x\n", temp);
-
-	MCF.ReadReg32(dev, DMPROGBUF0, &temp);
-	fprintf(stderr, "Po-DMPROGBUF0: %08x\n", temp);
-
-	MCF.ReadReg32(dev, DMPROGBUF1, &temp);
-	fprintf(stderr, "Po-DMPROGBUF1: %08x\n", temp);
-
-	MCF.ReadReg32(dev, DMPROGBUF2, &temp);
-	fprintf(stderr, "Po-DMPROGBUF2: %08x\n", temp);
-
-	MCF.ReadReg32(dev, DMPROGBUF3, &temp);
-	fprintf(stderr, "Po-DMPROGBUF3: %08x\n", temp);
-
-	MCF.ReadReg32(dev, DMPROGBUF4, &temp);
-	fprintf(stderr, "Po-DMPROGBUF4: %08x\n", temp);
-
-	MCF.ReadReg32(dev, DMPROGBUF5, &temp);
-	fprintf(stderr, "Po-DMPROGBUF5: %08x\n", temp);
-
-	MCF.ReadReg32(dev, DMPROGBUF6, &temp);
-	fprintf(stderr, "Po-DMPROGBUF6: %08x\n", temp);
-
-	MCF.ReadReg32(dev, DMPROGBUF7, &temp);
-	fprintf(stderr, "Po-DMPROGBUF7: %08x\n", temp);
-
-	MCF.ReadReg32(dev, DMCPBR, &temp);
-	fprintf(stderr, "Po-MCPBR: %08x\n", temp);
-
-	MCF.ReadReg32(dev, DMCFGR, &temp);
-	fprintf(stderr, "Po-DMCFGR: %08x\n", temp);
-
-	MCF.ReadReg32(dev, DMSHDWCFGR, &temp);
-	fprintf(stderr, "Po-DMSHDWCFGR: %08x\n", temp);
-	#endif
-
 	if( (rrv >> 8 ) & 7 )
 	{
 		if( !ignore )
@@ -799,6 +770,7 @@ static int DefaultWaitForDoneOp( void * dev, int ignore )
 			default: errortext = "Other Error"; break;
 			}
 
+			uint32_t temp;
 			MCF.ReadReg32( dev, DMSTATUS, &temp );
 			fprintf( stderr, "Fault writing memory (DMABSTRACTS = %08x) (%s) DMSTATUS: %08x\n", rrv, errortext, temp );
 		}
@@ -813,7 +785,7 @@ int DefaultSetupInterface( void * dev )
 	struct InternalState * iss = (struct InternalState*)(((struct ProgrammerStructBase*)dev)->internal);
 
 	if( MCF.Control3v3 ) MCF.Control3v3( dev, 1 );
-	if( MCF.DelayUS ) MCF.DelayUS( dev, 16000 );
+	MCF.DelayUS( dev, 16000 );
 	MCF.WriteReg32( dev, DMSHDWCFGR, 0x5aa50000 | (1<<10) ); // Shadow Config Reg
 	MCF.WriteReg32( dev, DMCFGR, 0x5aa50000 | (1<<10) ); // CFGR (1<<10 == Allow output from slave)
 	MCF.WriteReg32( dev, DMCFGR, 0x5aa50000 | (1<<10) ); // Bug in silicon?  If coming out of cold boot, and we don't do our little "song and dance" this has to be called.
@@ -859,7 +831,7 @@ static void StaticUpdatePROGBUFRegs( void * dev )
 	MCF.WriteReg32( dev, DMCOMMAND, 0x0023100d );      // Copy data to x13
 }
 
-static int InternalUnlockBootloader( void * dev )
+int InternalUnlockBootloader( void * dev )
 {
 	if( !MCF.WriteWord ) return -99;
 	int ret = 0;
@@ -884,6 +856,23 @@ static int InternalUnlockBootloader( void * dev )
 }
 
 
+int InternalIsMemoryErased( struct InternalState * iss, uint32_t address )
+{
+	if(( address & 0xff000000 ) != 0x08000000 ) return 0;
+	int sector = (address & 0xffffff) / iss->sector_size;
+	if( sector >= MAX_FLASH_SECTORS )
+		return 0;
+	else
+		return iss->flash_sector_status[sector];
+}
+
+void InternalMarkMemoryNotErased( struct InternalState * iss, uint32_t address )
+{
+	if(( address & 0xff000000 ) != 0x08000000 ) return;
+	int sector = (address & 0xffffff) / iss->sector_size;
+	if( sector < MAX_FLASH_SECTORS )
+		iss->flash_sector_status[sector] = 0;
+}
 
 static int DefaultWriteHalfWord( void * dev, uint32_t address_to_write, uint16_t data )
 {
@@ -1003,12 +992,7 @@ static int DefaultWriteWord( void * dev, uint32_t address_to_write, uint32_t dat
 	struct InternalState * iss = (struct InternalState*)(((struct ProgrammerStructBase*)dev)->internal);
 	int ret = 0;
 
-	int is_flash = 0;
-	if( ( address_to_write & 0xff000000 ) == 0x08000000 || ( address_to_write & 0x1FFFF800 ) == 0x1FFFF000 )
-	{
-		// Is flash.
-		is_flash = 1;
-	}
+	int is_flash = IsAddressFlash( address_to_write );
 
 	if( iss->statetag != STTAG( "WRSQ" ) || is_flash != iss->lastwriteflags )
 	{
@@ -1102,17 +1086,24 @@ int DefaultWriteBinaryBlob( void * dev, uint32_t address_to_write, uint32_t blob
 
 	uint32_t rw;
 	struct InternalState * iss = (struct InternalState*)(((struct ProgrammerStructBase*)dev)->internal);
-	int is_flash = 0;
+	int sectorsize = iss->sector_size;
+
+	// We can't write into flash when mapped to 0x00000000
+	if( address_to_write < 0x01000000 )
+		address_to_write |= 0x08000000;
+
+	int is_flash = IsAddressFlash( address_to_write );
 
 	if( blob_size == 0 ) return 0;
 
-	if( (address_to_write & 0xff000000) == 0x08000000 || (address_to_write & 0xff000000) == 0x00000000 || (address_to_write & 0x1FFFF800) == 0x1FFFF000 ) 
-		is_flash = 1;
 
-	// We can't write into flash when mapped to 0x00000000
-	if( is_flash )
-		address_to_write |= 0x08000000;
+	if( is_flash && !iss->flash_unlocked )
+	{
+		if( ( rw = InternalUnlockFlash( dev, iss ) ) )
+			return rw;
+	}
 
+	// Regardless of sector size, allow block write to do its thing if it can.
 	if( is_flash && MCF.BlockWrite64 && ( address_to_write & 0x3f ) == 0 && ( blob_size & 0x3f ) == 0 )
 	{
 		int i;
@@ -1128,50 +1119,48 @@ int DefaultWriteBinaryBlob( void * dev, uint32_t address_to_write, uint32_t blob
 		return 0;
 	}
 
-	if( is_flash && !iss->flash_unlocked )
-	{
-		if( ( rw = StaticUnlockFlash( dev, iss ) ) )
-			return rw;
-	}
-
-
-	uint8_t tempblock[64];
-	int sblock =  address_to_write >> 6;
-	int eblock = ( address_to_write + blob_size + 0x3f) >> 6;
+	uint8_t tempblock[sectorsize];
+	int sblock =  address_to_write / sectorsize;
+	int eblock = ( address_to_write + blob_size + (sectorsize-1) ) / sectorsize;
 	int b;
 	int rsofar = 0;
 
 	for( b = sblock; b < eblock; b++ )
 	{
-		int offset_in_block = address_to_write - (b * 64);
+		int offset_in_block = address_to_write - (b * sectorsize);
 		if( offset_in_block < 0 ) offset_in_block = 0;
-		int end_o_plus_one_in_block = ( address_to_write + blob_size ) - (b*64);
-		if( end_o_plus_one_in_block > 64 ) end_o_plus_one_in_block = 64;
-		int	base = b * 64;
+		int end_o_plus_one_in_block = ( address_to_write + blob_size ) - (b*sectorsize);
+		if( end_o_plus_one_in_block > sectorsize ) end_o_plus_one_in_block = sectorsize;
+		int	base = b * sectorsize;
 
-		if( offset_in_block == 0 && end_o_plus_one_in_block == 64 )
+		if( offset_in_block == 0 && end_o_plus_one_in_block == sectorsize )
 		{
-			if( MCF.BlockWrite64 ) 
+			if( MCF.BlockWrite64 )
 			{
-				int r = MCF.BlockWrite64( dev, base, blob + rsofar );
-				rsofar += 64;
-				if( r )
+				int i;
+				for( i = 0; i < sectorsize/64; i++ )
 				{
-					fprintf( stderr, "Error writing block at memory %08x (error = %d)\n", base, r );
-					return r;
+					int r = MCF.BlockWrite64( dev, base + i*64, blob + rsofar ); // rsofar already advances by 64 per iteration.
+					rsofar += 64;
+					if( r )
+					{
+						fprintf( stderr, "Error writing block at memory %08x (error = %d)\n", base + i*64, r );
+						return r;
+					}
 				}
 			}
 			else 					// Block Write not avaialble
 			{
 				if( is_flash )
 				{
-					MCF.Erase( dev, base, 64, 0 );
+					if( !InternalIsMemoryErased( iss, base ) )
+						MCF.Erase( dev, base, sectorsize, 0 );
 					MCF.WriteWord( dev, 0x40022010, CR_PAGE_PG ); // THIS IS REQUIRED, (intptr_t)&FLASH->CTLR = 0x40022010
 					MCF.WriteWord( dev, 0x40022010, CR_BUF_RST | CR_PAGE_PG );  // (intptr_t)&FLASH->CTLR = 0x40022010
 				}
 
 				int j;
-				for( j = 0; j < 16; j++ )
+				for( j = 0; j < sectorsize/4; j++ )
 				{
 					uint32_t writeword;
 					memcpy( &writeword, blob + rsofar, 4 );
@@ -1182,8 +1171,10 @@ int DefaultWriteBinaryBlob( void * dev, uint32_t address_to_write, uint32_t blob
 				if( is_flash )
 				{
 					MCF.WriteWord( dev, 0x40022014, base );  //0x40022014 -> FLASH->ADDR
+					if( MCF.PrepForLongOp ) MCF.PrepForLongOp( dev );  // Give the programmer a heads-up that this next operation could take a while.
 					MCF.WriteWord( dev, 0x40022010, CR_PAGE_PG|CR_STRT_Set ); // 0x40022010 -> FLASH->CTLR
 					if( MCF.WaitForFlash ) MCF.WaitForFlash( dev );
+					InternalMarkMemoryNotErased( iss, base );
 				}
 			}
 		}
@@ -1192,7 +1183,7 @@ int DefaultWriteBinaryBlob( void * dev, uint32_t address_to_write, uint32_t blob
 			//Ok, we have to do something wacky.
 			if( is_flash )
 			{
-				MCF.ReadBinaryBlob( dev, base, 64, tempblock );
+				MCF.ReadBinaryBlob( dev, base, sectorsize, tempblock );
 
 				// Permute tempblock
 				int tocopy = end_o_plus_one_in_block - offset_in_block;
@@ -1201,23 +1192,29 @@ int DefaultWriteBinaryBlob( void * dev, uint32_t address_to_write, uint32_t blob
 
 				if( MCF.BlockWrite64 ) 
 				{
-					int r = MCF.BlockWrite64( dev, base, tempblock );
-					if( r ) return r;
+					int i;
+					for( i = 0; i < sectorsize/64; i++ )
+					{
+						int r = MCF.BlockWrite64( dev, base+i*64, tempblock+i*64 );
+						if( r ) return r;
+					}
 				}
 				else
 				{
-					MCF.Erase( dev, base, 64, 0 );
+					if( !InternalIsMemoryErased( iss, base ) )
+						MCF.Erase( dev, base, sectorsize, 0 );
 					MCF.WriteWord( dev, 0x40022010, CR_PAGE_PG ); // THIS IS REQUIRED, (intptr_t)&FLASH->CTLR = 0x40022010
 					MCF.WriteWord( dev, 0x40022010, CR_BUF_RST | CR_PAGE_PG );  // (intptr_t)&FLASH->CTLR = 0x40022010
 
 					int j;
-					for( j = 0; j < 16; j++ )
+					for( j = 0; j < sectorsize/4; j++ )
 					{
 						MCF.WriteWord( dev, j*4+base, *(uint32_t*)(tempblock + j * 4) );
 						rsofar += 4;
 					}
 					MCF.WriteWord( dev, 0x40022014, base );  //0x40022014 -> FLASH->ADDR
 					MCF.WriteWord( dev, 0x40022010, CR_PAGE_PG|CR_STRT_Set ); // 0x40022010 -> FLASH->CTLR
+					InternalMarkMemoryNotErased( iss, base );
 				}
 				if( MCF.WaitForFlash && MCF.WaitForFlash( dev ) ) goto timedout;
 			}
@@ -1225,7 +1222,7 @@ int DefaultWriteBinaryBlob( void * dev, uint32_t address_to_write, uint32_t blob
 			{
 				// Accessing RAM.  Be careful to only do the needed operations.
 				int j;
-				for( j = 0; j < 16; j++ )
+				for( j = 0; j < sectorsize/4; j++ )
 				{
 					uint32_t taddy = j*4;
 					if( offset_in_block <= taddy && end_o_plus_one_in_block >= taddy + 4 )
@@ -1285,7 +1282,7 @@ int DefaultWriteBinaryBlob( void * dev, uint32_t address_to_write, uint32_t blob
 	}
 #endif
 
-	if(MCF.DelayUS) MCF.DelayUS( dev, 100 ); // Why do we need this? (We seem to need this on the WCH programmers?)
+	MCF.DelayUS( dev, 100 ); // Why do we need this? (We seem to need this on the WCH programmers?)
 	return 0;
 timedout:
 	fprintf( stderr, "Timed out\n" );
@@ -1337,8 +1334,7 @@ static int DefaultReadWord( void * dev, uint32_t address_to_read, uint32_t * dat
 		}
 
 		MCF.WriteReg32( dev, DMDATA1, address_to_read );
-		//MCF.WriteReg32( dev, DMCOMMAND, 0x00241000 ); // Only execute. //AH removed as part of CH32V307 discussion
-		MCF.WriteReg32( dev, DMCOMMAND, 0x00261000 ); //AH replacement based on that discussion
+		MCF.WriteReg32( dev, DMCOMMAND, 0x00241000 ); 
 
 		iss->statetag = STTAG( "RDSQ" );
 		iss->currentstateval = address_to_read;
@@ -1357,21 +1353,29 @@ static int DefaultReadWord( void * dev, uint32_t address_to_read, uint32_t * dat
 	return r;
 }
 
-static int StaticUnlockFlash( void * dev, struct InternalState * iss )
+int InternalUnlockFlash( void * dev, struct InternalState * iss )
 {
+	int ret = 0;
 	uint32_t rw;
-	MCF.ReadWord( dev, 0x40022010, &rw );  // FLASH->CTLR = 0x40022010
+	ret = MCF.ReadWord( dev, 0x40022010, &rw );  // FLASH->CTLR = 0x40022010
 	if( rw & 0x8080 ) 
 	{
+		ret = MCF.WriteWord( dev, 0x40022004, 0x45670123 ); // FLASH->KEYR = 0x40022004
+		if( ret ) goto reterr;
+		ret = MCF.WriteWord( dev, 0x40022004, 0xCDEF89AB );
+		if( ret ) goto reterr;
+		ret = MCF.WriteWord( dev, 0x40022008, 0x45670123 ); // OBKEYR = 0x40022008
+		if( ret ) goto reterr;
+		ret = MCF.WriteWord( dev, 0x40022008, 0xCDEF89AB );
+		if( ret ) goto reterr;
+		ret = MCF.WriteWord( dev, 0x40022024, 0x45670123 ); // MODEKEYR = 0x40022024
+		if( ret ) goto reterr;
+		ret = MCF.WriteWord( dev, 0x40022024, 0xCDEF89AB );
+		if( ret ) goto reterr;
+
+		ret = MCF.ReadWord( dev, 0x40022010, &rw ); // FLASH->CTLR = 0x40022010
+		if( ret ) goto reterr;
 
-		MCF.WriteWord( dev, 0x40022004, 0x45670123 ); // FLASH->KEYR = 0x40022004
-		MCF.WriteWord( dev, 0x40022004, 0xCDEF89AB );
-		MCF.WriteWord( dev, 0x40022008, 0x45670123 ); // OBKEYR = 0x40022008
-		MCF.WriteWord( dev, 0x40022008, 0xCDEF89AB );
-		MCF.WriteWord( dev, 0x40022024, 0x45670123 ); // MODEKEYR = 0x40022024
-		MCF.WriteWord( dev, 0x40022024, 0xCDEF89AB );
-
-		MCF.ReadWord( dev, 0x40022010, &rw ); // FLASH->CTLR = 0x40022010
 		if( rw & 0x8080 ) 
 		{
 			fprintf( stderr, "Error: Flash is not unlocked (CTLR = %08x)\n", rw );
@@ -1380,6 +1384,9 @@ static int StaticUnlockFlash( void * dev, struct InternalState * iss )
 	}
 	iss->flash_unlocked = 1;
 	return 0;
+reterr:
+	fprintf( stderr, "Error unlocking flash, got code %d from underlying system\n", ret );
+	return ret;
 }
 
 int DefaultErase( void * dev, uint32_t address, uint32_t length, int type )
@@ -1389,9 +1396,8 @@ int DefaultErase( void * dev, uint32_t address, uint32_t length, int type )
 
 	if( !iss->flash_unlocked )
 	{
-		if( ( rw = StaticUnlockFlash( dev, iss ) ) )
+		if( ( rw = InternalUnlockFlash( dev, iss ) ) )
 			return rw;
-		printf( "Flash unlocked\n" );
 	}
 
 	if( type == 1 )
@@ -1399,37 +1405,47 @@ int DefaultErase( void * dev, uint32_t address, uint32_t length, int type )
 		// Whole-chip flash
 		iss->statetag = STTAG( "XXXX" );
 		printf( "Whole-chip erase\n" );
-		MCF.WriteWord( dev, (intptr_t)&FLASH->CTLR, 0 );
-		MCF.WriteWord( dev, (intptr_t)&FLASH->CTLR, FLASH_CTLR_MER  );
-		MCF.WriteWord( dev, (intptr_t)&FLASH->CTLR, CR_STRT_Set|FLASH_CTLR_MER );
+		if( MCF.WriteWord( dev, (intptr_t)&FLASH->CTLR, 0 ) ) goto flashoperr;
+		if( MCF.WriteWord( dev, (intptr_t)&FLASH->CTLR, FLASH_CTLR_MER  ) ) goto flashoperr;
+		if( MCF.PrepForLongOp ) MCF.PrepForLongOp( dev );  // Give the programmer a heads-up that this next operation could take a while.
+		if( MCF.WriteWord( dev, (intptr_t)&FLASH->CTLR, CR_STRT_Set|FLASH_CTLR_MER ) ) goto flashoperr;
 		rw = MCF.WaitForDoneOp( dev, 0 );
 		if( MCF.WaitForFlash && MCF.WaitForFlash( dev ) ) { fprintf( stderr, "Error: Wait for flash error.\n" ); return -11; }
-		rw = MCF.WaitForDoneOp( dev, 0 );
-		MCF.WriteWord( dev, (intptr_t)&FLASH->CTLR, 0 );
-		rw = MCF.WaitForDoneOp( dev, 0 );
-		fprintf( stderr, "Whole Chip Erase Code: %d\n", rw );
+		MCF.VoidHighLevelState( dev );
+		memset( iss->flash_sector_status, 1, sizeof( iss->flash_sector_status ) );
 	}
 	else
 	{
 		// 16.4.7, Step 3: Check the BSY bit of the FLASH_STATR register to confirm that there are no other programming operations in progress.
 		// skip (we make sure at the end)
-
 		int chunk_to_erase = address;
-
 		while( chunk_to_erase < address + length )
 		{
+			if( ( chunk_to_erase & 0xff000000 ) == 0x08000000 )
+			{
+				int sector = ( chunk_to_erase & 0x00ffffff ) / iss->sector_size;
+				if( sector < MAX_FLASH_SECTORS )
+				{
+					iss->flash_sector_status[sector] = 1;
+				}
+			}
+
 			// Step 4:  set PAGE_ER of FLASH_CTLR(0x40022010)
-			MCF.WriteWord( dev, (intptr_t)&FLASH->CTLR, CR_PAGE_ER ); // Actually FTER
+			if( MCF.WriteWord( dev, (intptr_t)&FLASH->CTLR, CR_PAGE_ER ) ) goto flashoperr; // Actually FTER
 			// Step 5: Write the first address of the fast erase page to the FLASH_ADDR register.
-			MCF.WriteWord( dev, (intptr_t)&FLASH->ADDR, chunk_to_erase  );
-
+			if( MCF.WriteWord( dev, (intptr_t)&FLASH->ADDR, chunk_to_erase ) ) goto flashoperr;
 			// Step 6: Set the STAT bit of FLASH_CTLR register to '1' to initiate a fast page erase (64 bytes) action.
-			MCF.WriteWord( dev, (intptr_t)&FLASH->CTLR, CR_STRT_Set | CR_PAGE_ER );
+			if( MCF.PrepForLongOp ) MCF.PrepForLongOp( dev );  // Give the programmer a heads-up that this next operation could take a while.
+			if( MCF.WriteWord( dev, (intptr_t)&FLASH->CTLR, CR_STRT_Set | CR_PAGE_ER ) ) goto flashoperr;
 			if( MCF.WaitForFlash && MCF.WaitForFlash( dev ) ) return -99;
-			chunk_to_erase+=64;
+			chunk_to_erase+=iss->sector_size;
 		}
 	}
+
 	return 0;
+flashoperr:
+	fprintf( stderr, "Error: Flash operation error\n" );
+	return -93;
 }
 
 int DefaultReadBinaryBlob( void * dev, uint32_t address_to_read_from, uint32_t read_size, uint8_t * blob )
@@ -1598,28 +1614,26 @@ static int DefaultHaltMode( void * dev, int mode )
 	struct InternalState * iss = (struct InternalState*)(((struct ProgrammerStructBase*)dev)->internal);
 	switch ( mode )
 	{
-	case 5: // Don't reboot.
-	case 0:
+	case HALT_MODE_HALT_BUT_NO_RESET: // Don't reboot.
+	case HALT_MODE_HALT_AND_RESET:
 		MCF.WriteReg32( dev, DMSHDWCFGR, 0x5aa50000 | (1<<10) ); // Shadow Config Reg
 		MCF.WriteReg32( dev, DMCFGR, 0x5aa50000 | (1<<10) ); // CFGR (1<<10 == Allow output from slave)
 		MCF.WriteReg32( dev, DMCFGR, 0x5aa50000 | (1<<10) ); // Bug in silicon?  If coming out of cold boot, and we don't do our little "song and dance" this has to be called.
-
 		MCF.WriteReg32( dev, DMCONTROL, 0x80000001 ); // Make the debug module work properly.
 		if( mode == 0 ) MCF.WriteReg32( dev, DMCONTROL, 0x80000003 ); // Reboot.
 		MCF.WriteReg32( dev, DMCONTROL, 0x80000001 ); // Re-initiate a halt request.
-
 //		MCF.WriteReg32( dev, DMCONTROL, 0x00000001 ); // Clear Halt Request.  This is recommended, but not doing it seems more stable.
 		// Sometimes, even if the processor is halted but the MSB is clear, it will spuriously start?
 		MCF.FlushLLCommands( dev );
 		break;
-	case 1:
+	case HALT_MODE_REBOOT:
 		MCF.WriteReg32( dev, DMCONTROL, 0x80000001 ); // Make the debug module work properly.
 		MCF.WriteReg32( dev, DMCONTROL, 0x80000001 ); // Initiate a halt request.
 		MCF.WriteReg32( dev, DMCONTROL, 0x80000003 ); // Reboot.
 		MCF.WriteReg32( dev, DMCONTROL, 0x40000001 ); // resumereq
 		MCF.FlushLLCommands( dev );
 		break;
-	case 2:
+	case HALT_MODE_RESUME:
 		MCF.WriteReg32( dev, DMSHDWCFGR, 0x5aa50000 | (1<<10) ); // Shadow Config Reg
 		MCF.WriteReg32( dev, DMCFGR, 0x5aa50000 | (1<<10) ); // CFGR (1<<10 == Allow output from slave)
 		MCF.WriteReg32( dev, DMCFGR, 0x5aa50000 | (1<<10) ); // Bug in silicon?  If coming out of cold boot, and we don't do our little "song and dance" this has to be called.
@@ -1627,7 +1641,7 @@ static int DefaultHaltMode( void * dev, int mode )
 		MCF.WriteReg32( dev, DMCONTROL, 0x40000001 ); // resumereq
 		MCF.FlushLLCommands( dev );
 		break;
-	case 3:
+	case HALT_MODE_GO_TO_BOOTLOADER:
 		MCF.WriteReg32( dev, DMCONTROL, 0x80000001 ); // Make the debug module work properly.
 		MCF.WriteReg32( dev, DMCONTROL, 0x80000001 ); // Initiate a halt request.
 
@@ -1645,29 +1659,24 @@ static int DefaultHaltMode( void * dev, int mode )
 	default:
 		fprintf( stderr, "Error: Unknown halt mode %d\n", mode );
 	}
-#if 0
-	int i;
-	for( i = 0; i < 100; i++ )
-	{
-		uint32_t temp = 0;
-		MCF.ReadReg32( dev, DMSTATUS, &temp );
-		fprintf( stderr, "DMSTATUS: %08x\n", temp );
-		usleep( 20000);
-	}
-#endif
 
+	iss->flash_unlocked = 0;
 	iss->processor_in_mode = mode;
+
+	// Give the processor time to finish halting, e.g. if it was in the middle of a flash op.
+	MCF.DelayUS( dev, 3000 );
+
 	return 0;
 }
 
-// Returns positive if received text.
+// Returns positive if received text, or request for input.
+// Returns -1 if the other end acknowledged without printing anything (not an error).
 // Returns negative if error.
 // Returns 0 if no text waiting.
 // maxlen MUST be at least 8 characters.  We null terminate.
 int DefaultPollTerminal( void * dev, uint8_t * buffer, int maxlen, uint32_t leaveflagA, int leaveflagB )
 {
 	struct InternalState * iss = (struct InternalState*)(((struct ProgrammerStructBase*)dev)->internal);
-
 	int r;
 	uint32_t rr;
 	if( iss->statetag != STTAG( "TERM" ) )
@@ -1677,16 +1686,13 @@ int DefaultPollTerminal( void * dev, uint8_t * buffer, int maxlen, uint32_t leav
 	}
 	r = MCF.ReadReg32( dev, DMDATA0, &rr );
 	if( r < 0 ) return r;
-
 	if( maxlen < 8 ) return -9;
 
 	// DMDATA1:
 	//  bit  7 = host-acknowledge.
 	if( rr & 0x80 )
 	{
-		int ret = 0;
 		int num_printf_chars = (rr & 0xf)-4;
-
 		if( num_printf_chars > 0 && num_printf_chars <= 7)
 		{
 			if( num_printf_chars > 3 )
@@ -1699,11 +1705,12 @@ int DefaultPollTerminal( void * dev, uint8_t * buffer, int maxlen, uint32_t leav
 			if( firstrem > 3 ) firstrem = 3;
 			memcpy( buffer, ((uint8_t*)&rr)+1, firstrem );
 			buffer[num_printf_chars] = 0;
-			ret = num_printf_chars;
 		}
 		if( leaveflagA ) MCF.WriteReg32( dev, DMDATA1, leaveflagB );
 		MCF.WriteReg32( dev, DMDATA0, leaveflagA ); // Write that we acknowledge the data.
-		return ret;
+		if( num_printf_chars == 0 ) return -1;      // was acked?
+		if( num_printf_chars < 0 ) num_printf_chars = 0;
+		return num_printf_chars;
 	}
 	else
 	{
@@ -1713,10 +1720,9 @@ int DefaultPollTerminal( void * dev, uint8_t * buffer, int maxlen, uint32_t leav
 
 int DefaultUnbrick( void * dev )
 {
-	// TODO: Why doesn't this work on the ESP32S2?
-
 	printf( "Entering Unbrick Mode\n" );
 	MCF.Control3v3( dev, 0 );
+
 	MCF.DelayUS( dev, 60000 );
 	MCF.DelayUS( dev, 60000 );
 	MCF.DelayUS( dev, 60000 );
@@ -1754,16 +1760,7 @@ int DefaultUnbrick( void * dev )
 	// After more experimentation, it appaers to work best by not clearing the halt request.
 
 	MCF.FlushLLCommands( dev );
-	if( MCF.DelayUS )
-		MCF.DelayUS( dev, 20000 );
-	else
-	{
-#if defined(WINDOWS) || defined(WIN32) || defined(_WIN32)
-		Sleep(20);
-#else
-		usleep(20000);
-#endif
-	}
+	MCF.DelayUS( dev, 20000 );
 
 	if( timeout == max_timeout ) 
 	{
@@ -1779,138 +1776,6 @@ int DefaultConfigureNRSTAsGPIO( void * dev, int one_if_yes_gpio  )
 {
 	fprintf( stderr, "Error: DefaultConfigureNRSTAsGPIO does not work via the programmer here.  Please see the demo \"optionbytes\"\n" );
 	return -5;
-#if 0
-	int ret = 0;
-	uint32_t csw;
-
-
-	if( MCF.ReadWord( dev, 0x1FFFF800, &csw ) )
-	{
-		fprintf( stderr, "Error: failed to get user word\n" );
-		return -5;
-	}
-
-	printf( "CSW WAS : %08x\n", csw );
-
-	MCF.WriteWord( dev, 0x40022008, 0x45670123 ); // OBKEYR = 0x40022008
-	MCF.WriteWord( dev, 0x40022008, 0xCDEF89AB );
-	MCF.WriteWord( dev, 0x40022004, 0x45670123 ); // FLASH->KEYR = 0x40022004
-	MCF.WriteWord( dev, 0x40022004, 0xCDEF89AB );
-	MCF.WriteWord( dev, 0x40022024, 0x45670123 ); // MODEKEYR = 0x40022024
-	MCF.WriteWord( dev, 0x40022024, 0xCDEF89AB );
-
-//XXXX THIS DOES NOT WORK IT CANNOT ERASE.
-	uint32_t ctlr;
-	if( MCF.ReadWord( dev, 0x40022010, &ctlr ) ) // FLASH->CTLR = 0x40022010
-	{
-		return -9;
-	}
-	ctlr |= CR_OPTER_Set | CR_STRT_Set; // OBER
-	MCF.WriteWord( dev, 0x40022010, ctlr ); // FLASH->CTLR = 0x40022010
-	ret |= MCF.WaitForDoneOp( dev, 0 );
-	ret |= MCF.WaitForFlash( dev );
-
-	MCF.WriteHalfWord( dev, (intptr_t)&OB->RDPR, RDP_Key );
-
-    ctlr &=~CR_OPTER_Reset;
-	MCF.WriteWord( dev, 0x40022010, ctlr ); // FLASH->CTLR = 0x40022010
-	ret |= MCF.WaitForDoneOp( dev, 0 );
-	ret |= MCF.WaitForFlash( dev );
-    ctlr |= CR_OPTPG_Set;
-	MCF.WriteWord( dev, 0x40022010, ctlr ); // FLASH->CTLR = 0x40022010
-	ret |= MCF.WaitForDoneOp( dev, 0 );
-	ret |= MCF.WaitForFlash( dev );
-    ctlr &=~CR_OPTPG_Reset;
-	MCF.WriteWord( dev, 0x40022010, ctlr ); // FLASH->CTLR = 0x40022010
-	ret |= MCF.WaitForDoneOp( dev, 0 );
-	ret |= MCF.WaitForFlash( dev );
-
-
-// This does work to write the option bytes, but does NOT work to erase.
-
-	if( MCF.ReadWord( dev, 0x40022010, &ctlr ) ) // FLASH->CTLR = 0x40022010
-	{
-		return -9;
-	}
-	ctlr |= CR_OPTPG_Set; //OBPG
-	MCF.WriteWord( dev, 0x40022010, ctlr ); // FLASH->CTLR = 0x40022010
-	ret |= MCF.WaitForDoneOp( dev, 0 );
-	ret |= MCF.WaitForFlash( dev );
-
-	uint32_t config = OB_IWDG_HW | OB_STOP_NoRST | OB_STDBY_NoRST | (one_if_yes_gpio?OB_RST_NoEN:OB_RST_EN_DT1ms) | (uint16_t)0xE0;
-	printf( "Config (%08x): %08x\n", (intptr_t)&OB->USER, config );
-	MCF.WriteHalfWord( dev,  (intptr_t)&OB->USER, config );
-
-	ret |= MCF.WaitForDoneOp( dev, 0 );
-	ret |= MCF.WaitForFlash( dev );
-
-	ctlr &= CR_OPTPG_Reset;
-	MCF.WriteWord( dev, 0x40022010, ctlr ); // FLASH->CTLR = 0x40022010
-
-
-	if( MCF.ReadWord( dev, 0x1FFFF800, &csw ) )
-	{
-		fprintf( stderr, "Error: failed to get user word\n" );
-		return -5;
-	}
-
-	//csw >>= 16; // Only want bottom part of word.
-	printf( "CSW: %08x\n", csw );
-
-#if 0
-	uint32_t prevuser;
-	if( MCF.ReadWord( dev, 0x1FFFF800, &prevuser ) )
-	{
-		fprintf( stderr, "Error: failed to get user word\n" );
-		return -5;
-	}
-
-	ret |= MCF.WaitForFlash( dev );
-
-	// Erase.
-	MCF.ReadWord( dev, 0x40022010, &csw ); // FLASH->CTLR = 0x40022010
-	csw |= 1<<5;//OBER;
-	MCF.WriteWord( dev, 0x40022010, csw ); // FLASH->CTLR = 0x40022010
-	MCF.WriteHalfWord( dev, 0x1FFFF802, 0xffff );
-	ret |= MCF.WaitForDoneOp( dev, 0 );
-	ret |= MCF.WaitForFlash( dev );
-
-	MCF.ReadWord( dev, 0x40022010, &csw ); // FLASH->CTLR = 0x40022010
-	printf( "CTLR: %08x\n", csw );
-	csw |= 1<<9;//OBPG, OBWRE
-	MCF.WriteWord( dev, 0x40022010, csw );
-
-	int j;
-	for( j = 0; j < 5; j++ )
-	{
-		if( MCF.ReadWord( dev, 0x1FFFF800, &prevuser ) )
-		{
-			fprintf( stderr, "Error: failed to get user word\n" );
-			return -5;
-		}
-
-		//csw >>= 16; // Only want bottom part of word.
-		printf( "CSW was: %08x\n", prevuser );
-		csw = prevuser >> 16;
-		csw = csw & 0xe7e7;
-		csw |= (one_if_yes_gpio?0b11:0b00)<<(3+0);
-		csw |= (one_if_yes_gpio?0b00:0b11)<<(3+8);
-		printf( "CSW writing: %08x\n", csw );
-		MCF.WriteHalfWord( dev, 0x1FFFF802, csw );
-		ret |= MCF.WaitForDoneOp( dev, 0 );
-		ret |= MCF.WaitForFlash( dev );
-	}
-
-
-	MCF.ReadWord( dev, 0x40022010, &csw ); // FLASH->CTLR = 0x40022010
-	printf( "CTLR: %08x\n", csw );
-	csw &= ~(1<<9);//OBPG, OBWRE
-	MCF.WriteWord( dev, 0x40022010, csw );
-
-#endif
-	printf( "RET: %d\n", ret );
-	return 0;
-#endif
 }
 
 int DefaultConfigureReadProtection( void * dev, int one_if_yes_protect  )
@@ -1922,7 +1787,7 @@ int DefaultConfigureReadProtection( void * dev, int one_if_yes_protect  )
 int DefaultPrintChipInfo( void * dev )
 {
 	uint32_t reg;
-	MCF.HaltMode( dev, 5 );
+	MCF.HaltMode( dev, HALT_MODE_HALT_BUT_NO_RESET );
 	
 	if( MCF.ReadWord( dev, 0x1FFFF800, &reg ) ) goto fail;	
 	printf( "USER/RDPR  : %04x/%04x\n", reg>>16, reg&0xFFFF );
@@ -1953,10 +1818,20 @@ int DefaultVoidHighLevelState( void * dev )
 	return 0;
 }
 
+int DefaultDelayUS( void * dev, int us )
+{
+#if defined(WINDOWS) || defined(WIN32) || defined(_WIN32)
+	Sleep( (us+9999) / 1000 );
+#else
+	usleep( us );
+#endif
+	return 0;
+}
+
 int SetupAutomaticHighLevelFunctions( void * dev )
 {
 	// Will populate high-level functions from low-level functions.
-	if( MCF.WriteReg32 == 0 || MCF.ReadReg32 == 0 ) return -5;
+	if( MCF.WriteReg32 == 0 && MCF.ReadReg32 == 0 && MCF.WriteWord == 0 ) return -5;
 
 	// Else, TODO: Build the high level functions from low level functions.
 	// If a high-level function alrady exists, don't override.
@@ -2007,12 +1882,9 @@ int SetupAutomaticHighLevelFunctions( void * dev )
 		MCF.ConfigureNRSTAsGPIO = DefaultConfigureNRSTAsGPIO;
 	if( !MCF.VoidHighLevelState )
 		MCF.VoidHighLevelState = DefaultVoidHighLevelState;
+	if( !MCF.DelayUS )
+		MCF.DelayUS = DefaultDelayUS;
 
-	struct InternalState * iss = calloc( 1, sizeof( struct InternalState ) );
-	iss->ram_base = 0x20000000;
-	iss->ram_size = 2048;
-
-	((struct ProgrammerStructBase*)dev)->internal = iss;
 	return 0;
 }
 
@@ -2069,3 +1941,5 @@ void TestFunction(void * dev )
 	}
 }
 
+
+
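Note on the `SetupAutomaticHighLevelFunctions` change above: the function only fills in a default where the driver left a slot empty, so a driver-provided `DelayUS` always wins. A minimal sketch of that pattern follows. The `Demo*` names and the two-slot struct are hypothetical stand-ins for `struct MiniChlinkFunctions`, and the availability check is simplified to a single pointer rather than the three-way test in the patch.

```c
#include <stddef.h>

/* Hypothetical, cut-down stand-in for struct MiniChlinkFunctions. */
struct DemoFunctions
{
	int (*WriteWord)( void * dev, unsigned address, unsigned data );
	int (*DelayUS)( void * dev, int us );
};

/* Example of a function a driver might supply itself. */
static int DemoDriverWriteWord( void * dev, unsigned address, unsigned data )
{
	(void)dev; (void)address; (void)data;
	return 0;
}

/* Fallback used only when the driver supplied no DelayUS of its own. */
static int DemoDefaultDelayUS( void * dev, int us )
{
	(void)dev; (void)us;
	return 0; /* A real fallback would sleep here. */
}

/* Returns 0 on success, -5 if the driver offers no way to reach memory. */
static int DemoSetupDefaults( struct DemoFunctions * f )
{
	if( f->WriteWord == NULL ) return -5;              /* Simplified availability check. */
	if( !f->DelayUS ) f->DelayUS = DemoDefaultDelayUS; /* Never override a driver's choice. */
	return 0;
}
```

Because the defaults are assigned only into empty slots, the order in which a driver's `TryInit_*` function and this setup step run does not change which implementation ends up being called.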
diff --git a/minichlink/minichlink.exe b/minichlink/minichlink.exe
index b50eaf2bd5a237af60e010f42e93dabd64038bb8..41c7a0961ea714c1698c7f42a519ef9283b9b923 100644
Binary files a/minichlink/minichlink.exe and b/minichlink/minichlink.exe differ
diff --git a/minichlink/minichlink.h b/minichlink/minichlink.h
index 131a433c4183202c4ac8787c30ae89f13b79ea20..1433f6d05ea853f454ae6fba16599c65a27915a5 100644
--- a/minichlink/minichlink.h
+++ b/minichlink/minichlink.h
@@ -50,18 +50,15 @@ struct MiniChlinkFunctions
 
 	int (*SetEnableBreakpoints)( void * dev, int halt_on_break, int single_step );
 
+	int (*PrepForLongOp)( void * dev ); // Called before the command that will take a while.
 	int (*WaitForFlash)( void * dev );
 	int (*WaitForDoneOp)( void * dev, int ignore );
 
 	int (*PrintChipInfo)( void * dev );
 
-	// Geared for flash, but could be anything.
+	// Geared for flash, but could be anything.  Note: If in flash, must also erase.
 	int (*BlockWrite64)( void * dev, uint32_t address_to_write, uint8_t * data );
 
-	// TODO: What about 64-byte block-reads?
-	// TODO: What about byte read/write?
-	// TODO: What about half read/write?
-
 	// Returns positive if received text.
 	// Returns negative if error.
 	// Returns 0 if no text waiting.
@@ -86,6 +83,14 @@ struct MiniChlinkFunctions
 	FlushLLCommands
 */
 
+inline static int IsAddressFlash( uint32_t addy ) { return ( addy & 0xff000000 ) == 0x08000000 || ( addy & 0x1FFFF800 ) == 0x1FFFF000; }
+
+#define HALT_MODE_HALT_AND_RESET    0
+#define HALT_MODE_REBOOT            1
+#define HALT_MODE_RESUME            2
+#define HALT_MODE_GO_TO_BOOTLOADER  3
+#define HALT_MODE_HALT_BUT_NO_RESET 5
+
 // Convert a 4-character string to an int.
 #define STTAG( x ) (*((uint32_t*)(x)))
 
@@ -97,6 +102,8 @@ struct ProgrammerStructBase
 	// You can put other things here.
 };
 
+#define MAX_FLASH_SECTORS 262144
+
 struct InternalState
 {
 	uint32_t statetag;
@@ -107,6 +114,10 @@ struct InternalState
 	int autoincrement;
 	uint32_t ram_base;
 	uint32_t ram_size;
+	int sector_size;
+	int flash_size;
+	int target_chip_type; // 0 for unknown (or 003), otherwise a part number.
+	uint8_t flash_sector_status[MAX_FLASH_SECTORS];  // 0 means unerased/unknown. 1 means erased.
 };
 
 
@@ -143,14 +154,23 @@ struct InternalState
 	#define DLLDECORATE
 #endif
 
-void * MiniCHLinkInitAsDLL(struct MiniChlinkFunctions ** MCFO) DLLDECORATE;
-extern struct MiniChlinkFunctions MCF;
+/* initialization hints for init functions */
+/* could be expanded with more in the future (e.g., PID/VID hints, priorities, ...)*/
+/* not all init functions currently need these hints. */
+typedef struct {
+	const char * serial_port;
+	const char * specific_programmer;
+} init_hints_t;
 
+void * MiniCHLinkInitAsDLL(struct MiniChlinkFunctions ** MCFO, const init_hints_t* init_hints) DLLDECORATE;
+extern struct MiniChlinkFunctions MCF;
 
 // Returns 'dev' on success, else 0.
-void * TryInit_WCHLinkE();
-void * TryInit_ESP32S2CHFUN();
+void * TryInit_WCHLinkE(void);
+void * TryInit_ESP32S2CHFUN(void);
 void * TryInit_NHCLink042(void);
+void * TryInit_B003Fun(void);
+void * TryInit_Ardulink(const init_hints_t*);
 
 // Returns 0 if ok, populated, 1 if not populated.
 int SetupAutomaticHighLevelFunctions( void * dev );
@@ -160,6 +180,10 @@ int64_t SimpleReadNumberInt( const char * number, int64_t defaultNumber );
 
 // For drivers to call
 int DefaultVoidHighLevelState( void * dev );
+int InternalUnlockBootloader( void * dev );
+int InternalIsMemoryErased( struct InternalState * iss, uint32_t address );
+void InternalMarkMemoryNotErased( struct InternalState * iss, uint32_t address );
+int InternalUnlockFlash( void * dev, struct InternalState * iss );
 
 // GDBSever Functions
 int SetupGDBServer( void * dev );
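Note on the new `flash_sector_status` field: the header only declares `InternalIsMemoryErased` and `InternalMarkMemoryNotErased`, so their bodies are not visible in this part of the patch. The sketch below shows one way such per-sector bookkeeping could work. Everything named `Demo*` is hypothetical, and the mapping from address to sector index (low 24 bits divided by the sector size) is an assumption, not something the header confirms.

```c
#include <stdint.h>

#define DEMO_MAX_SECTORS 1024 /* Much smaller than MAX_FLASH_SECTORS, for illustration. */

struct DemoState
{
	int sector_size;                         /* Bytes per erase sector, e.g. 64. */
	uint8_t sector_status[DEMO_MAX_SECTORS]; /* 0 = unerased/unknown, 1 = erased. */
};

/* Assumption: sectors are counted from the start of the flash window. */
static int DemoSectorIndex( const struct DemoState * s, uint32_t address )
{
	if( s->sector_size <= 0 ) return -1;
	uint32_t idx = ( address & 0x00ffffff ) / (uint32_t)s->sector_size;
	return ( idx < DEMO_MAX_SECTORS ) ? (int)idx : -1;
}

static int DemoIsErased( const struct DemoState * s, uint32_t address )
{
	int i = DemoSectorIndex( s, address );
	return i >= 0 && s->sector_status[i]; /* Unknown sectors count as not erased. */
}

static void DemoMarkErased( struct DemoState * s, uint32_t address )
{
	int i = DemoSectorIndex( s, address );
	if( i >= 0 ) s->sector_status[i] = 1;
}

static void DemoMarkNotErased( struct DemoState * s, uint32_t address )
{
	int i = DemoSectorIndex( s, address );
	if( i >= 0 ) s->sector_status[i] = 0;
}
```

Treating "unknown" the same as "not erased" is the conservative choice: the worst case is a redundant erase, never a write into a dirty sector.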
diff --git a/minichlink/pgm-b003fun.c b/minichlink/pgm-b003fun.c
new file mode 100644
index 0000000000000000000000000000000000000000..a1681087a842e39d941c19a61872bb26cf8b13de
--- /dev/null
+++ b/minichlink/pgm-b003fun.c
@@ -0,0 +1,777 @@
+#include <stdint.h>
+#include "hidapi.h"
+#include "minichlink.h"
+#include <string.h>
+#include <stdlib.h>
+#include <stdio.h>
+
+#include "../ch32v003fun/ch32v003fun.h"
+
+//#define DEBUG_B003
+
+#if defined(WINDOWS) || defined(WIN32) || defined(_WIN32)
+void Sleep(uint32_t dwMilliseconds);
+#define usleep( x ) Sleep( (x) / 1000 )
+#else
+#include <unistd.h>
+#endif
+
+
+struct B003FunProgrammerStruct
+{
+	void * internal; // Part of struct ProgrammerStructBase 
+
+	hid_device * hd;
+	uint8_t commandbuffer[128];
+	uint8_t respbuffer[128];
+	int commandplace;
+	int prepping_for_erase;
+};
+
+static const unsigned char byte_wise_read_blob[] = { // No alignment restrictions.
+	0x23, 0xa0, 0x05, 0x00, 0x13, 0x07, 0x45, 0x03, 0x0c, 0x43, 0x50, 0x43,
+	0x2e, 0x96, 0x21, 0x07, 0x94, 0x21, 0x14, 0xa3, 0x85, 0x05, 0x05, 0x07,
+	0xe3, 0xcc, 0xc5, 0xfe, 0x93, 0x06, 0xf0, 0xff, 0x14, 0xc1, 0x82, 0x80,
+	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+};
+
+static const unsigned char half_wise_read_blob[] = {  // size and address must be aligned by 2.
+	0x23, 0xa0, 0x05, 0x00, 0x13, 0x07, 0x45, 0x03, 0x0c, 0x43, 0x50, 0x43,
+	0x2e, 0x96, 0x21, 0x07, 0x96, 0x21, 0x16, 0xa3, 0x89, 0x05, 0x09, 0x07,
+	0xe3, 0xcc, 0xc5, 0xfe, 0x93, 0x06, 0xf0, 0xff, 0x14, 0xc1, 0x82, 0x80,
+	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+};
+
+static const unsigned char word_wise_read_blob[] = { // size and address must be aligned by 4.
+	0x23, 0xa0, 0x05, 0x00, 0x13, 0x07, 0x45, 0x03, 0x0c, 0x43, 0x50, 0x43,
+	0x2e, 0x96, 0x21, 0x07, 0x94, 0x41, 0x14, 0xc3, 0x91, 0x05, 0x11, 0x07,
+	0xe3, 0xcc, 0xc5, 0xfe, 0x93, 0x06, 0xf0, 0xff, 0x14, 0xc1, 0x82, 0x80,
+	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+};
+
+static const unsigned char word_wise_write_blob[] = { // size and address must be aligned by 4.
+	0x23, 0xa0, 0x05, 0x00, 0x13, 0x07, 0x45, 0x03, 0x0c, 0x43, 0x50, 0x43,
+	0x2e, 0x96, 0x21, 0x07, 0x14, 0x43, 0x94, 0xc1, 0x91, 0x05, 0x11, 0x07,
+	0xe3, 0xcc, 0xc5, 0xfe, 0x93, 0x06, 0xf0, 0xff, 0x14, 0xc1, 0x82, 0x80, // NOTE: No readback!
+	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+/*
+	0x23, 0xa0, 0x05, 0x00, 0x13, 0x07, 0x45, 0x03, 0x0c, 0x43, 0x50, 0x43,
+	0x2e, 0x96, 0x21, 0x07, 0x14, 0x43, 0x94, 0xc1, 0x94, 0x41, 0x14, 0xc3, // With readback.
+	0x91, 0x05, 0x11, 0x07, 0xe3, 0xca, 0xc5, 0xfe, 0x93, 0x06, 0xf0, 0xff,
+	0x14, 0xc1, 0x82, 0x80, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 */
+};
+
+static const unsigned char write64_flash[] = { // size and address must be aligned by 4.
+  0x13, 0x07, 0x45, 0x03, 0x0c, 0x43, 0x13, 0x86, 0x05, 0x04, 0x5c, 0x43,
+  0x8c, 0xc7, 0x14, 0x47, 0x94, 0xc1, 0xb7, 0x06, 0x05, 0x00, 0xd4, 0xc3,
+  0x94, 0x41, 0x91, 0x05, 0x11, 0x07, 0xe3, 0xc8, 0xc5, 0xfe, 0xc1, 0x66,
+  0x93, 0x86, 0x06, 0x04, 0xd4, 0xc3, 0xfd, 0x56, 0x14, 0xc1, 0x82, 0x80
+};
+
+static const unsigned char half_wise_write_blob[] = { // size and address must be aligned by 2
+	0x23, 0xa0, 0x05, 0x00, 0x13, 0x07, 0x45, 0x03, 0x0c, 0x43, 0x50, 0x43,
+	0x2e, 0x96, 0x21, 0x07, 0x16, 0x23, 0x96, 0xa1, 0x96, 0x21, 0x16, 0xa3,
+	0x89, 0x05, 0x09, 0x07, 0xe3, 0xca, 0xc5, 0xfe, 0x93, 0x06, 0xf0, 0xff,
+	0x14, 0xc1, 0x82, 0x80, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+};
+
+static const unsigned char byte_wise_write_blob[] = { // No alignment restrictions.
+	0x23, 0xa0, 0x05, 0x00, 0x13, 0x07, 0x45, 0x03, 0x0c, 0x43, 0x50, 0x43,
+	0x2e, 0x96, 0x21, 0x07, 0x14, 0x23, 0x94, 0xa1, 0x94, 0x21, 0x14, 0xa3,
+	0x85, 0x05, 0x05, 0x07, 0xe3, 0xca, 0xc5, 0xfe, 0x93, 0x06, 0xf0, 0xff,
+	0x14, 0xc1, 0x82, 0x80, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00
+};
+
+// Just set the countdown to 0 to avoid any issues.
+//   li a3, 0; sw a3, 0(a1); li a3, -1; sw a3, 0(a0); ret;
+static const unsigned char halt_wait_blob[] = {
+	0x81, 0x46, 0x94, 0xc1, 0xfd, 0x56, 0x14, 0xc1, 0x82, 0x80 };
+
+// Set the countdown to -1 to cause main system to execute.
+//   li a3, -1; sw a3, 0(a1); li a3, -1; sw a3, 0(a0); ret;
+//static const unsigned char run_app_blob[] = {
+//	0xfd, 0x56, 0x94, 0xc1, 0xfd, 0x56, 0x14, 0xc1, 0x82, 0x80 };
+//
+// Alternatively, we do it ourselves.
+static const unsigned char run_app_blob[] = {
+	0x37, 0x07, 0x67, 0x45, 0xb7, 0x27, 0x02, 0x40, 0x13, 0x07, 0x37, 0x12,
+	0x98, 0xd7, 0x37, 0x97, 0xef, 0xcd, 0x13, 0x07, 0xb7, 0x9a, 0x98, 0xd7,
+	0x23, 0xa6, 0x07, 0x00, 0x13, 0x07, 0x00, 0x08, 0x98, 0xcb, 0xb7, 0xf7,
+	0x00, 0xe0, 0x37, 0x07, 0x00, 0x80, 0x23, 0xa8, 0xe7, 0xd0, 0x82, 0x80,
+};
+
+
+static void ResetOp( struct B003FunProgrammerStruct * eps )
+{
+	memset( eps->commandbuffer, 0, sizeof( eps->commandbuffer ) );
+	memcpy( eps->commandbuffer, "\xaa\x00\x00\x00", 4 );
+	eps->commandplace = 4;
+}
+
+static void WriteOp4( struct B003FunProgrammerStruct * eps, uint32_t opsend )
+{
+	int place = eps->commandplace;
+	int newend = place + 4;
+	if( newend < sizeof( eps->commandbuffer ) )
+	{
+		memcpy( eps->commandbuffer + place, &opsend, 4 );
+	}
+	eps->commandplace = newend;
+}
+
+
+static void WriteOpArb( struct B003FunProgrammerStruct * eps, const uint8_t * data, int len )
+{
+	int place = eps->commandplace;
+	int newend = place + len;
+	if( newend < sizeof( eps->commandbuffer ) )
+	{
+		memcpy( eps->commandbuffer + place, data, len );
+	}
+	eps->commandplace = newend;
+}
+
+static int CommitOp( struct B003FunProgrammerStruct * eps )
+{
+	int retries = 0;
+	int r;
+
+	uint32_t magic_go = 0x1234abcd;
+	memcpy( eps->commandbuffer + 124, &magic_go, 4 );
+
+	#ifdef DEBUG_B003
+	{
+		int i;
+		printf( "Commit TX: %d bytes\n", (int)sizeof(eps->commandbuffer) );
+		for( i = 0; i < sizeof(eps->commandbuffer) ; i++ )
+		{
+			printf( "%02x ", eps->commandbuffer[i] );
+			if( ( i & 0xf ) == 0xf ) printf( "\n" );
+		}
+		if( ( i & 0xf ) != 0 ) printf( "\n" );
+	}
+	#endif
+
+resend:
+	r = hid_send_feature_report( eps->hd, eps->commandbuffer, sizeof(eps->commandbuffer) );
+	#ifdef DEBUG_B003
+	printf( "hid_send_feature_report = %d\n", r );
+	#endif
+	if( r < 0 )
+	{
+		fprintf( stderr, "Warning: Issue with hid_send_feature_report. Retrying\n" );
+		if( retries++ > 10 )
+			return r;
+		else
+			goto resend;
+	}
+
+
+	if( eps->prepping_for_erase )
+	{
+		usleep(4000);
+	}
+
+	int timeout = 0;
+
+	do
+	{
+		eps->respbuffer[0] = 0xaa;
+		r = hid_get_feature_report( eps->hd, eps->respbuffer, sizeof(eps->respbuffer) );
+
+		#ifdef DEBUG_B003
+		{
+			int i;
+			printf( "Commit RX: %d bytes\n", r );
+			for( i = 0; i < r; i++ )
+			{
+				printf( "%02x ", eps->respbuffer[i] );
+				if( ( i & 0xf ) == 0xf ) printf( "\n" );
+			}
+			if( ( i & 0xf ) != 0 ) printf( "\n" );
+		}
+		#endif
+
+		if( r < 0 )
+		{
+			if( retries++ > 10 ) return r;
+			continue;
+		}
+
+		if( eps->respbuffer[1] == 0xff ) break;
+
+		if( timeout++ > 20 )
+		{
+			fprintf( stderr, "Error: Timed out waiting for stub to complete\n" );
+			return -99;
+		}
+	} while( 1 );
+	return 0;
+}
+
+static int B003FunFlushLLCommands( void * dev )
+{
+	// All commands are synchronous anyway.
+	return 0;
+}
+
+
+static int B003FunWaitForDoneOp( void * dev, int ignore )
+{
+	// It's synchronous, so no issue here.
+	return 0;
+}
+
+static int B003FunDelayUS( void * dev, int microseconds )
+{
+	usleep( microseconds );
+	return 0;
+}
+
+// Does not handle erasing
+static int InternalB003FunWriteBinaryBlob( void * dev, uint32_t address_to_write_to, uint32_t write_size, const uint8_t * blob )
+{
+	struct B003FunProgrammerStruct * eps = (struct B003FunProgrammerStruct *)dev;
+
+	int is_flash = IsAddressFlash( address_to_write_to );
+
+	if( ( address_to_write_to & 0x1 ) && write_size > 0 )
+	{
+		// Need to do byte-wise writing in front to line up with word alignment.
+		ResetOp( eps );
+		WriteOpArb( eps, byte_wise_write_blob, sizeof(byte_wise_write_blob) );
+		WriteOp4( eps, address_to_write_to ); // Base address to write.
+		WriteOp4( eps, 1 ); // Write 1 byte.
+		memcpy( &eps->commandbuffer[60], blob, 1 );
+		if( CommitOp( eps ) ) return -5;
+		if( is_flash && memcmp( &eps->commandbuffer[60], blob, 1 ) ) goto verifyfail;
+		blob++;
+		write_size --;
+		address_to_write_to++;
+	}
+	if( ( address_to_write_to & 0x2 ) && write_size > 1 )
+	{
+		// Need to do half-word writing in front to line up with word alignment.
+		ResetOp( eps );
+		WriteOpArb( eps, half_wise_write_blob, sizeof(half_wise_write_blob) );
+		WriteOp4( eps, address_to_write_to ); // Base address to write.
+		WriteOp4( eps, 2 ); // write 2 bytes.
+		memcpy( &eps->commandbuffer[60], blob, 2 );
+		if( CommitOp( eps ) ) return -5;
+		if( is_flash && memcmp( &eps->commandbuffer[60], blob, 2 ) ) goto verifyfail;
+		blob += 2;
+		write_size -= 2;
+		address_to_write_to+=2;
+	}
+	while( write_size > 3 )
+	{
+		int to_write_this_time = write_size & (~3);
+		if( to_write_this_time > 64 ) to_write_this_time = 64;
+
+		// Aligned now; write the bulk word-wise, up to 64 bytes per operation.
+		ResetOp( eps );
+		WriteOpArb( eps, word_wise_write_blob, sizeof(word_wise_write_blob) );
+		WriteOp4( eps, address_to_write_to ); // Base address to write.
+		WriteOp4( eps, to_write_this_time ); // Write up to 64 bytes (a multiple of 4).
+		memcpy( &eps->commandbuffer[60], blob, to_write_this_time );
+		if( CommitOp( eps ) ) return -5;
+		if( is_flash && memcmp( &eps->commandbuffer[60], blob, to_write_this_time ) ) goto verifyfail;
+		blob += to_write_this_time;
+		write_size -= to_write_this_time;
+		address_to_write_to += to_write_this_time;
+	}
+	if( write_size > 1 )
+	{
+		// Write a trailing half-word if two or more bytes remain.
+		ResetOp( eps );
+		WriteOpArb( eps, half_wise_write_blob, sizeof(half_wise_write_blob) );
+		WriteOp4( eps, address_to_write_to ); // Base address to write.
+		WriteOp4( eps, 2 ); // write 2 bytes.
+		memcpy( &eps->commandbuffer[60], blob, 2 );
+		if( CommitOp( eps ) ) return -5;
+		if( is_flash && memcmp( &eps->commandbuffer[60], blob, 2 ) ) goto verifyfail;
+		blob += 2;
+		write_size -= 2;
+		address_to_write_to += 2;
+	}
+	if( write_size )
+	{
+		// Write the final trailing byte.
+		ResetOp( eps );
+		WriteOpArb( eps, byte_wise_write_blob, sizeof(byte_wise_write_blob) );
+		WriteOp4( eps, address_to_write_to ); // Base address to write.
+		WriteOp4( eps, 1 ); // write 1 byte.
+		memcpy( &eps->commandbuffer[60], blob, 1 );
+		if( CommitOp( eps ) ) return -5;
+		if( is_flash && memcmp( &eps->commandbuffer[60], blob, 1 ) ) goto verifyfail;
+		blob += 1;
+		write_size -= 1;
+		address_to_write_to+=1;
+	}
+	eps->prepping_for_erase = 0;
+	return 0;
+verifyfail:
+	fprintf( stderr, "Error: Write Binary Blob: verify failed with %u bytes left to write at %08x\n", write_size, address_to_write_to );
+	return -6;
+}
+
+static int B003FunReadBinaryBlob( void * dev, uint32_t address_to_read_from, uint32_t read_size, uint8_t * blob )
+{
+	struct B003FunProgrammerStruct * eps = (struct B003FunProgrammerStruct *)dev;
+
+#ifdef DEBUG_B003
+	printf( "Read Binary Blob: %d bytes from %08x\n", read_size, address_to_read_from );
+#endif
+
+	if( ( address_to_read_from & 0x1 ) && read_size > 0 )
+	{
+		// Need to do byte-wise reading in front to line up with word alignment.
+		ResetOp( eps );
+		WriteOpArb( eps, byte_wise_read_blob, sizeof(byte_wise_read_blob) );
+		WriteOp4( eps, address_to_read_from ); // Base address to read.
+		WriteOp4( eps, 1 ); // Read 1 byte.
+		if( CommitOp( eps ) ) return -5;
+		memcpy( blob, &eps->respbuffer[60], 1 );
+		blob++;
+		read_size --;
+		address_to_read_from++;
+	}
+	if( ( address_to_read_from & 0x2 ) && read_size > 1 )
+	{
+		// Need to do half-word reading in front to line up with word alignment.
+		ResetOp( eps );
+		WriteOpArb( eps, half_wise_read_blob, sizeof(half_wise_read_blob) );
+		WriteOp4( eps, address_to_read_from ); // Base address to read.
+		WriteOp4( eps, 2 ); // Read 2 bytes.
+		if( CommitOp( eps ) ) return -5;
+		memcpy( blob, &eps->respbuffer[60], 2 );
+		blob += 2;
+		read_size -= 2;
+		address_to_read_from+=2;
+	}
+	while( read_size > 3 )
+	{
+		int to_read_this_time = read_size & (~3);
+		if( to_read_this_time > 64 ) to_read_this_time = 64;
+
+		// Aligned now; read the bulk word-wise, up to 64 bytes per operation.
+		ResetOp( eps );
+		WriteOpArb( eps, word_wise_read_blob, sizeof(word_wise_read_blob) );
+		WriteOp4( eps, address_to_read_from ); // Base address to read.
+		WriteOp4( eps, to_read_this_time ); // Read up to 64 bytes (a multiple of 4).
+		if( CommitOp( eps ) ) return -5;
+		memcpy( blob, &eps->respbuffer[60], to_read_this_time );
+		blob += to_read_this_time;
+		read_size -= to_read_this_time;
+		address_to_read_from += to_read_this_time;
+	}
+	if( read_size > 1 )
+	{
+		// Read a trailing half-word if two or more bytes remain.
+		ResetOp( eps );
+		WriteOpArb( eps, half_wise_read_blob, sizeof(half_wise_read_blob) );
+		WriteOp4( eps, address_to_read_from ); // Base address to read.
+		WriteOp4( eps, 2 ); // Read 2 bytes.
+		if( CommitOp( eps ) ) return -5;
+		memcpy( blob, &eps->respbuffer[60], 2 );
+		blob += 2;
+		read_size -= 2;
+		address_to_read_from += 2;
+	}
+	if( read_size )
+	{
+		// Read the final trailing byte.
+		ResetOp( eps );
+		WriteOpArb( eps, byte_wise_read_blob, sizeof(byte_wise_read_blob) );
+		WriteOp4( eps, address_to_read_from ); // Base address to read.
+		WriteOp4( eps, 1 ); // Read 1 byte.
+		if( CommitOp( eps ) ) return -5;
+		memcpy( blob, &eps->respbuffer[60], 1 );
+		blob += 1;
+		read_size -= 1;
+		address_to_read_from+=1;
+	}
+	return 0;
+}
+
+static int InternalB003FunBoot( void * dev )
+{
+	struct B003FunProgrammerStruct * eps = (struct B003FunProgrammerStruct*) dev;
+
+	printf( "Booting\n" );
+	ResetOp( eps );
+	WriteOpArb( eps, run_app_blob, sizeof(run_app_blob) );
+	if( CommitOp( eps ) ) return -5;
+	return 0;
+}
+
+static int B003FunSetupInterface( void * dev )
+{
+	struct B003FunProgrammerStruct * eps = (struct B003FunProgrammerStruct*) dev;
+	printf( "Halting Boot Countdown\n" );
+	ResetOp( eps );
+	WriteOpArb( eps, halt_wait_blob, sizeof(halt_wait_blob) );
+	if( CommitOp( eps ) ) return -5;
+	return 0;
+}
+
+static int B003FunExit( void * dev )
+{	
+	return 0;
+}
+
+// MUST be 4-byte-aligned.
+static int B003FunWriteWord( void * dev, uint32_t address_to_write, uint32_t data )
+{
+	return InternalB003FunWriteBinaryBlob( dev, address_to_write, 4, (uint8_t*)&data );
+}
+
+static int B003FunReadWord( void * dev, uint32_t address_to_read, uint32_t * data )
+{
+	return B003FunReadBinaryBlob( dev, address_to_read, 4, (uint8_t*)data );
+}
+
+static int B003FunBlockWrite64( void * dev, uint32_t address_to_write, uint8_t * data )
+{
+	struct B003FunProgrammerStruct * eps = (struct B003FunProgrammerStruct*) dev;
+	struct InternalState * iss = eps->internal;
+
+	if( IsAddressFlash( address_to_write ) )
+	{
+		if( !iss->flash_unlocked )
+		{
+			int rw;
+			if( ( rw = InternalUnlockFlash( dev, iss ) ) )
+				return rw;
+		}
+
+		if( !InternalIsMemoryErased( iss, address_to_write ) )
+		{
+			if( MCF.Erase( dev, address_to_write, 64, 0 ) )
+			{
+				fprintf( stderr, "Error: Failed to erase sector at %08x\n", address_to_write );
+				return -9;
+			}
+		}
+
+		// Not actually needed.
+		MCF.WriteWord( dev, 0x40022010, CR_PAGE_PG ); // (intptr_t)&FLASH->CTLR = 0x40022010
+		MCF.WriteWord( dev, 0x40022010, CR_PAGE_PG | CR_BUF_RST); // (intptr_t)&FLASH->CTLR = 0x40022010
+
+		ResetOp( eps );
+		WriteOpArb( eps, write64_flash, sizeof(write64_flash) );
+		WriteOp4( eps, address_to_write ); // Base address to write. @52
+		WriteOp4( eps, 0x4002200c ); // FLASH STATR base address. @ 56
+		memcpy( &eps->commandbuffer[60], data, 64 ); // @60
+		if( MCF.PrepForLongOp ) MCF.PrepForLongOp( dev );  // Give the programmer a heads-up that the next operation could take a while.
+		if( CommitOp( eps ) ) return -5;
+
+		// This is actually built-in.
+//		MCF.WriteWord( dev, 0x40022010, CR_PAGE_PG|CR_STRT_Set); // (intptr_t)&FLASH->CTLR = 0x40022010  (actually commit)
+	}
+	else
+	{
+		return InternalB003FunWriteBinaryBlob( dev, address_to_write, 64, data );
+	}
+
+	return 0;
+}
+
+static int B003FunWriteHalfWord( void * dev, uint32_t address_to_write, uint16_t data )
+{
+	return InternalB003FunWriteBinaryBlob( dev, address_to_write, 2, (uint8_t*)&data );
+}
+
+static int B003FunReadHalfWord( void * dev, uint32_t address_to_read, uint16_t * data )
+{
+	return B003FunReadBinaryBlob( dev, address_to_read, 2, (uint8_t*)data );
+}
+
+static int B003FunWriteByte( void * dev, uint32_t address_to_write, uint8_t data )
+{
+	return InternalB003FunWriteBinaryBlob( dev, address_to_write, 1, &data );
+}
+
+static int B003FunReadByte( void * dev, uint32_t address_to_read, uint8_t * data )
+{
+	return B003FunReadBinaryBlob( dev, address_to_read, 1, data );
+}
+
+
+static int B003FunHaltMode( void * dev, int mode )
+{
+	struct InternalState * iss = (struct InternalState*)(((struct ProgrammerStructBase*)dev)->internal);
+	switch ( mode )
+	{
+	case HALT_MODE_HALT_BUT_NO_RESET: // Don't reboot.
+	case HALT_MODE_HALT_AND_RESET:    // Reboot and halt
+		// This programmer is always halted anyway.
+		break;
+
+	case HALT_MODE_REBOOT:            // Actually boot?
+		InternalB003FunBoot( dev );
+		break;
+
+	case HALT_MODE_RESUME:
+		fprintf( stderr, "Warning: this programmer cannot resume\n" );
+		// We can't do this.
+		break;
+
+	case HALT_MODE_GO_TO_BOOTLOADER:
+		fprintf( stderr, "Warning: this programmer is already a bootloader.  Can't go into bootloader\n" );
+		break;
+
+	default:
+		fprintf( stderr, "Error: Unknown halt mode %d\n", mode );
+	}
+
+	iss->processor_in_mode = mode;
+	return 0;
+}
+
+
+int B003FunPrepForLongOp( void * dev )
+{
+	struct B003FunProgrammerStruct * d = (struct B003FunProgrammerStruct*)dev;
+	d->prepping_for_erase = 1;
+	return 0;
+}
+
+
+void * TryInit_B003Fun(void)
+{
+	#define VID 0x1209
+	#define PID 0xb003
+	hid_init();
+	hid_device * hd = hid_open( VID, PID, 0); // third parameter is "serial"
+	if( !hd ) return 0;
+
+	//extern int g_hidapiSuppress;
+	//g_hidapiSuppress = 1;  // Suppress errors for this device.  (don't do this yet)
+
+	struct B003FunProgrammerStruct * eps = malloc( sizeof( struct B003FunProgrammerStruct ) );
+	memset( eps, 0, sizeof( *eps ) );
+	eps->hd = hd;
+	eps->commandplace = 1;
+
+	memset( &MCF, 0, sizeof( MCF ) );
+	MCF.WriteReg32 = 0;
+	MCF.ReadReg32 = 0;
+	MCF.FlushLLCommands = B003FunFlushLLCommands;
+	MCF.DelayUS = B003FunDelayUS;
+	MCF.Control3v3 = 0;
+	MCF.SetupInterface = B003FunSetupInterface;
+	MCF.Exit = B003FunExit;
+	MCF.HaltMode = 0;
+	MCF.VoidHighLevelState = 0;
+	MCF.PollTerminal = 0;
+
+	// These are optional. Disabling these is a good mechanism to make sure the core functions still work.
+	MCF.WriteWord = B003FunWriteWord;
+	MCF.ReadWord = B003FunReadWord;
+
+	MCF.WriteHalfWord = B003FunWriteHalfWord;
+	MCF.ReadHalfWord = B003FunReadHalfWord;
+
+	MCF.WriteByte = B003FunWriteByte;
+	MCF.ReadByte = B003FunReadByte;
+
+	MCF.WaitForDoneOp = B003FunWaitForDoneOp;
+	MCF.BlockWrite64 = B003FunBlockWrite64;
+	MCF.ReadBinaryBlob = B003FunReadBinaryBlob;
+
+	MCF.PrepForLongOp = B003FunPrepForLongOp;
+
+	MCF.HaltMode = B003FunHaltMode;
+
+	return eps;
+}
+
+
+
+
+
+// Utility for generating bootloader code:
+
+// make rv003usb.bin &&  xxd -i -s 100 -l 44 rv003usb.bin
+
+/*
+// Read data, arbitrarily from memory. (byte-wise)
+. =  0x66
+	sw x0, 0(a1);       // Stop Countdown
+	addi a4, a0, 52;    // Start reading properties, starting from scratchpad + 52.
+	c.lw a1, 0(a4);     // Get starting address to read
+	c.lw a2, 4(a4);     // Get length to read.
+	c.add a2, a1        // a2 is now ending address.
+	c.addi a4, 8		// start writing back at byte 60.
+1:
+	XW_C_LBU(a3, a1, 0);	//lbu a3, 0(a1)       // Read from RAM
+	XW_C_SB(a3, a4, 0);		//sb a3, 0(a4)       // Store into scratchpad
+	c.addi a1, 1        // Advance pointers
+	c.addi a4, 1
+	blt a1, a2, 1b      // Loop til all read.
+	addi a3, x0, -1
+	sw a3, 0(a0)		// Write -1 into 0x00 indicating all done.
+	ret
+	.long 0,0,0,0,0,0,0
+*/
+
+/*
+// Read data, arbitrarily from memory. (half-wise)
+
+. =  0x66
+	sw x0, 0(a1);       // Stop Countdown
+	addi a4, a0, 52;    // Start reading properties, starting from scratchpad + 52.
+	c.lw a1, 0(a4);     // Get starting address to read
+	c.lw a2, 4(a4);     // Get length to read.
+	c.add a2, a1        // a2 is now ending address.
+	c.addi a4, 8		// start writing back at byte 60.
+1:
+	XW_C_LHU(a3, a1, 0);	//lhu a3, 0(a1)       // Read from RAM
+	XW_C_SH(a3, a4, 0);		//sh a3, 0(a4)       // Store into scratchpad
+	c.addi a1, 2        // Advance pointers
+	c.addi a4, 2
+	blt a1, a2, 1b      // Loop til all read.
+	addi a3, x0, -1
+	sw a3, 0(a0)		// Write -1 into 0x00 indicating all done.
+	ret
+	.long 0,0,0,0,0,0,0
+*/
+
+/*
+// Read data, arbitrarily from memory. (word-wise)
+. =  0x66
+	sw x0, 0(a1);       // Stop Countdown
+	addi a4, a0, 52;    // Start reading properties, starting from scratchpad + 52.
+	c.lw a1, 0(a4);     // Get starting address to read
+	c.lw a2, 4(a4);     // Get length to read.
+	c.add a2, a1        // a2 is now ending address.
+	c.addi a4, 8		// start writing back at byte 60.
+1:
+	lw a3, 0(a1);		//lw a3, 0(a1)       // Read from RAM
+	sw a3, 0(a4);		//sw a3, 0(a4)       // Store into scratchpad
+	c.addi a1, 4        // Advance pointers
+	c.addi a4, 4
+	blt a1, a2, 1b      // Loop til all read.
+	addi a3, x0, -1
+	sw a3, 0(a0)		// Write -1 into 0x00 indicating all done.
+	ret
+	.long 0,0,0,0,0,0,0
+*/
+/*
+// Write data, arbitrarily to memory. (word-wise)
+. =  0x66
+	sw x0, 0(a1);       // Stop Countdown
+	addi a4, a0, 52;    // Start reading properties, starting from scratchpad + 52.
+	c.lw a1, 0(a4);     // Get starting address to read
+	c.lw a2, 4(a4);     // Get length to read.
+	c.add a2, a1        // a2 is now ending address.
+	c.addi a4, 8		// start writing back at byte 60.
+1:
+	lw a3, 0(a4);		// Read from scratchpad
+	sw a3, 0(a1);		// Store into RAM
+	lw a3, 0(a1);       // Read-back
+	sw a3, 0(a4);
+	c.addi a1, 4        // Advance pointers
+	c.addi a4, 4
+	blt a1, a2, 1b      // Loop til all read.
+	addi a3, x0, -1
+	sw a3, 0(a0)		// Write -1 into 0x00 indicating all done.
+	ret
+	.long 0,0,0,0,0,0
+*/
+
+/*
+// Write data, arbitrarily to memory. (half-wise)
+. =  0x66
+	sw x0, 0(a1);       // Stop Countdown
+	addi a4, a0, 52;    // Start reading properties, starting from scratchpad + 52.
+	c.lw a1, 0(a4);     // Get starting address to read
+	c.lw a2, 4(a4);     // Get length to read.
+	c.add a2, a1        // a2 is now ending address.
+	c.addi a4, 8		// start writing back at byte 60.
+1:
+	XW_C_LHU(a3, a4, 0);	//lhu a3, 0(a4)       // Read from scratchpad
+	XW_C_SH(a3, a1, 0);		//sh a3, 0(a1)       // Store into RAM
+	XW_C_LHU(a3, a1, 0);	//lhu a3, 0(a1)       // Read back
+	XW_C_SH(a3, a4, 0);		//sh a3, 0(a4)       // Store readback into scratchpad
+	c.addi a1, 2        // Advance pointers
+	c.addi a4, 2
+	blt a1, a2, 1b      // Loop til all read.
+	addi a3, x0, -1
+	sw a3, 0(a0)		// Write -1 into 0x00 indicating all done.
+	ret
+	.long 0,0,0,0,0,0
+*/
+
+/*
+// Write data, arbitrarily to memory. (byte-wise)
+. =  0x66
+	sw x0, 0(a1);       // Stop Countdown
+	addi a4, a0, 52;    // Start reading properties, starting from scratchpad + 52.
+	c.lw a1, 0(a4);     // Get starting address to read
+	c.lw a2, 4(a4);     // Get length to read.
+	c.add a2, a1        // a2 is now ending address.
+	c.addi a4, 8		// start writing back at byte 60.
+1:
+	XW_C_LBU(a3, a4, 0);	//lbu a3, 0(a4)       // Read from scratchpad
+	XW_C_SB(a3, a1, 0);		//sb a3, 0(a1)       // Store into RAM
+	XW_C_LBU(a3, a1, 0);	//Read back
+	XW_C_SB(a3, a4, 0);
+	c.addi a1, 1        // Advance pointers
+	c.addi a4, 1
+	blt a1, a2, 1b      // Loop til all read.
+	addi a3, x0, -1
+	sw a3, 0(a0)		// Write -1 into 0x00 indicating all done.
+	ret
+	.long 0,0,0,0,0,0
+*/
+
+
+/* Run app blob
+				FLASH->BOOT_MODEKEYR = FLASH_KEY1;
+				FLASH->BOOT_MODEKEYR = FLASH_KEY2;
+				FLASH->STATR = 0; // 1<<14 is zero, so, boot user code.
+				FLASH->CTLR = CR_LOCK_Set;
+				PFIC->SCTLR = 1<<31;
+*/
+
+
+/* Write flash block 64.
+
+. =  0x66
+	addi a4, a0, 52;    // Start reading properties, starting from scratchpad + 52.
+	c.lw a1, 0(a4);     // a1 = Address to write to.
+	addi a2, a1, 64     // a2 = end of section to write to
+	c.lw a5, 4(a4);     // a5 = FLASH->STATR address (0x4002200c); CTLR is 4(a5), ADDR is 8(a5).
+
+	// Must be done outside.
+//	li a3, 0x00080000 | 0x00010000;
+	//c.sw a3, 0(a5);  //FLASH->CTLR = CR_BUF_RST | CR_PAGE_PG
+	c.sw a1, 8(a5);     //FLASH->ADDR = writing location.
+
+	1:
+		c.lw a3, 8(a4);		// Read from scratchpad (data starts @60)
+		c.sw a3, 0(a1);		// Store into flash
+
+		li a3, 0x00010000 | 0x00040000;	// CR_PAGE_PG | FLASH_CTLR_BUF_LOAD
+		c.sw a3, 4(a5);     // Load into flash write buffer.
+
+		c.lw a3, 0(a1);		//Tricky: By reading from flash here, we force it to wait for completion.
+		c.addi a1, 4        // Advance pointers
+		c.addi a4, 4
+
+	//	// Wait for write to complete.
+	//	2:	c.lw a3, 0(a5)   // read FLASH->STATR 
+	//		c.andi a3, 1     // Mask off BUSY bit.
+	//		c.bnez a3, 2b
+
+
+		blt a1, a2, 1b      // Loop til all read.
+
+	li a3, 0x00010000 | 0x00000040 //CR_PAGE_PG|CR_STRT_Set
+	c.sw a3, 4(a5);     //FLASH->CTLR = CR_PAGE_PG|CR_STRT_Set
+	li a3, -1
+	c.sw a3, 0(a0)		// Write -1 into 0x00 indicating all done.
+	ret
+*/
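Note on the B003Fun command packet: every operation in pgm-b003fun.c sends one 128-byte feature report with a fixed layout. The first byte is 0xaa (the byte hidapi treats as the report ID), the stub code starts at offset 4, the two parameters sit at offsets 52 and 56, the payload at 60, and the magic word 0x1234abcd at 124. The sketch below assembles such a packet in one place, using only the offsets visible in `ResetOp`, `WriteOp4` and `CommitOp`. The function and enum names are hypothetical and are not part of the patch.

```c
#include <stdint.h>
#include <string.h>

enum
{
	PKT_SIZE     = 128, /* sizeof( commandbuffer ) */
	PKT_CODE_AT  = 4,   /* Stub machine code follows the 4-byte header. */
	PKT_ADDR_AT  = 52,  /* 4 + 48-byte stub. */
	PKT_LEN_AT   = 56,
	PKT_DATA_AT  = 60,  /* Up to 64 payload bytes. */
	PKT_MAGIC_AT = 124
};

/* Fills pkt[128]. Returns 0 on success, -1 if the arguments do not fit. */
static int DemoBuildPacket( uint8_t * pkt, const uint8_t * stub, size_t stub_len,
	uint32_t address, uint32_t length, const uint8_t * data )
{
	const uint32_t magic_go = 0x1234abcd;

	if( stub_len > (size_t)( PKT_ADDR_AT - PKT_CODE_AT ) ) return -1;
	if( length > (uint32_t)( PKT_MAGIC_AT - PKT_DATA_AT ) ) return -1;

	memset( pkt, 0, PKT_SIZE );
	pkt[0] = 0xaa;                                /* Same header as ResetOp(). */
	memcpy( pkt + PKT_CODE_AT, stub, stub_len );
	memcpy( pkt + PKT_ADDR_AT, &address, 4 );     /* Host byte order, like WriteOp4(). */
	memcpy( pkt + PKT_LEN_AT, &length, 4 );
	if( data ) memcpy( pkt + PKT_DATA_AT, data, length );
	memcpy( pkt + PKT_MAGIC_AT, &magic_go, 4 );   /* Same value CommitOp() writes. */
	return 0;
}
```

Rejecting oversized arguments up front differs from `WriteOpArb`, which silently skips the copy but still advances `commandplace`. Whether the driver should fail loudly in that case is a design choice left to the patch author.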
diff --git a/minichlink/pgm-esp32s2-ch32xx.c b/minichlink/pgm-esp32s2-ch32xx.c
index ef8a721e7c3cbfbaf335c47bb353c666c2156d39..090da555ff10f8cd2b1945d671a6e4861f9aaf9d 100644
--- a/minichlink/pgm-esp32s2-ch32xx.c
+++ b/minichlink/pgm-esp32s2-ch32xx.c
@@ -12,6 +12,8 @@ struct ESP32ProgrammerStruct
 	int commandplace;
 	uint8_t reply[256];
 	int replylen;
+
+	int dev_version;
 };
 
 int ESPFlushLLCommands( void * dev );
@@ -243,12 +245,18 @@ int ESPBlockWrite64( void * dev, uint32_t address_to_write, uint8_t * data )
 
 retry:
 
-	Write2LE( eps, 0x0bfe );
+	if( eps->dev_version >= 2 && InternalIsMemoryErased( (struct InternalState*)eps->internal, address_to_write ) )
+		Write2LE( eps, 0x0efe );
+	else
+		Write2LE( eps, 0x0bfe );
 	Write4LE( eps, address_to_write );
+
 	int i;
 	int timeout = 0;
 	for( i = 0; i < 64; i++ ) Write1( eps, data[i] );
 
+	InternalMarkMemoryNotErased( (struct InternalState*)eps->internal, address_to_write );
+
 	do
 	{
 		ESPFlushLLCommands( dev );
@@ -409,6 +417,7 @@ void * TryInit_ESP32S2CHFUN()
 	memset( eps, 0, sizeof( *eps ) );
 	eps->hd = hd;
 	eps->commandplace = 1;
+	eps->dev_version = 0;
 
 	memset( &MCF, 0, sizeof( MCF ) );
 	MCF.WriteReg32 = ESPWriteReg32;
@@ -431,9 +440,16 @@ void * TryInit_ESP32S2CHFUN()
 
 	MCF.BlockWrite64 = ESPBlockWrite64;
 	MCF.VendorCommand = ESPVendorCommand;
+
 	// Reset internal programmer state.
 	Write2LE( eps, 0x0afe );
-
+	ESPFlushLLCommands( eps );
+	Write2LE( eps, 0xfefe );
+	ESPFlushLLCommands( eps );
+	if( eps->replylen > 1 )
+	{
+		eps->dev_version = eps->reply[1];
+	}
 	return eps;
 }
 
diff --git a/minichlink/pgm-wch-linke.c b/minichlink/pgm-wch-linke.c
index c07468846d57a5dd83deb1bbe79bc1b38c985322..6034f24bdd6128cca92b9f4da5296d497155f788 100644
--- a/minichlink/pgm-wch-linke.c
+++ b/minichlink/pgm-wch-linke.c
@@ -17,9 +17,9 @@ struct LinkEProgrammerStruct
 };
 
 // For non-ch32v003 chips.
-static int LEReadBinaryBlob( void * d, uint32_t offset, uint32_t amount, uint8_t * readbuff );
+//static int LEReadBinaryBlob( void * d, uint32_t offset, uint32_t amount, uint8_t * readbuff );
 static int InternalLinkEHaltMode( void * d, int mode );
-//static int LEWriteBinaryBlob( void * d, uint32_t address_to_write, uint32_t len, uint8_t * blob );
+static int LEWriteBinaryBlob( void * d, uint32_t address_to_write, uint32_t len, uint8_t * blob );
 
 #define WCHTIMEOUT 5000
 #define WCHCHECK(x) if( (status = x) ) { fprintf( stderr, "Bad USB Operation on " __FILE__ ":%d (%d)\n", __LINE__, status ); exit( status ); }
@@ -191,6 +191,7 @@ int LEFlushLLCommands( void * dev )
 static int LESetupInterface( void * d )
 {
 	libusb_device_handle * dev = ((struct LinkEProgrammerStruct*)d)->devh;
+	struct InternalState * iss = (struct InternalState*)(((struct ProgrammerStructBase*)d)->internal);
 	uint8_t rbuff[1024];
 	uint32_t transferred = 0;
 
@@ -223,6 +224,10 @@ static int LESetupInterface( void * d )
 	// TODO: What in the world is this?  It doesn't appear to be needed.
 	wch_link_command( dev, "\x81\x0c\x02\x09\x01", 5, 0, 0, 0 ); //Reply is: 820c0101
 
+	// Note from further debugging:
+	// My capture differs in this case: \x05 instead of \x09 -> But does not seem to be needed
+	//wch_link_command( dev, "\x81\x0c\x02\x05\x01", 5, 0, 0, 0 ); //Reply is: 820c0101
+
 	// This puts the processor on hold to allow the debugger to run.
 	wch_link_command( dev, "\x81\x0d\x01\x02", 4, (int*)&transferred, rbuff, 1024 ); // Reply: Ignored, 820d050900300500
 	if (rbuff[0] == 0x81 && rbuff[1] == 0x55 && rbuff[2] == 0x01 && rbuff[3] == 0x01)
@@ -230,14 +235,20 @@ static int LESetupInterface( void * d )
 		fprintf(stderr, "link error, nothing connected to linker\n");
 		return -1;
 	}
-        uint32_t target_chip_type = ( rbuff[4] << 4) + (rbuff[5] >> 4);
-        fprintf(stderr, "Chip Type: %03x\n", target_chip_type);
-        if( target_chip_type == 0x307 )
-        {
-                fprintf( stderr, "CH32V307 Detected.  Allowing old-flash-mode for operation.\n" );
-                //MCF.WriteBinaryBlob = LEWriteBinaryBlob;
-                MCF.ReadBinaryBlob = LEReadBinaryBlob;
-        }
+
+	uint32_t mcu_series = rbuff[4] << 4;
+	uint32_t target_chip_type = mcu_series + (rbuff[5] >> 4);
+	fprintf(stderr, "Chip Type: %03x\n", target_chip_type);
+
+	if( mcu_series == 0x300 || mcu_series == 0x200 )
+	{
+		fprintf( stderr, "CH32V30x or CH32V20x MCU detected. Using binary blob write for operation.\n" );
+		MCF.WriteBinaryBlob = LEWriteBinaryBlob;
+
+		iss->sector_size = 256;
+
+		wch_link_command( dev, "\x81\x0d\x01\x03", 4, (int*)&transferred, rbuff, 1024 ); // Reply: Ignored, 820d050900300500
+	}
 
 	// For some reason, if we don't do this sometimes the programmer starts in a hosey mode.
 	MCF.WriteReg32( d, DMCONTROL, 0x80000001 ); // Make the debug module work properly.
@@ -245,8 +256,7 @@ static int LESetupInterface( void * d )
 	MCF.WriteReg32( d, DMCONTROL, 0x80000001 ); // No, really make sure.
 	MCF.WriteReg32( d, DMABSTRACTCS, 0x00000700 ); // Ignore any pending errors.
 	MCF.WriteReg32( d, DMABSTRACTAUTO, 0 );
-	//MCF.WriteReg32( d, DMCOMMAND, 0x00261000 ); // Read x0 (Null command) //AH changed as part of CH32V307 discussion
-	MCF.WriteReg32( d, DMCOMMAND, 0x00221000 ); // Read x0 (Null command) //AH replacement based on that discussion
+	MCF.WriteReg32( d, DMCOMMAND, 0x00221000 ); // Read x0 (Null command) with nopostexec (to fix v307 read issues)
 
 	int r = 0;
 
@@ -261,17 +271,36 @@ static int LESetupInterface( void * d )
 	}
 
 	// This puts the processor on hold to allow the debugger to run.
-	wch_link_command( dev, "\x81\x11\x01\x09", 4, (int*)&transferred, rbuff, 1024 ); // Reply: Chip ID + Other data (see below)
+	// Recommended to switch to 05 from 09 by Alexander M
+	//	wch_link_command( dev, "\x81\x11\x01\x09", 4, (int*)&transferred, rbuff, 1024 ); // Reply: Chip ID + Other data (see below)
+	wch_link_command( dev, "\x81\x11\x01\x05", 4, (int*)&transferred, rbuff, 1024 ); // Reply: Chip ID + Other data (see below)
+
 	if( transferred != 20 )
 	{
 		fprintf( stderr, "Error: could not get part status\n" );
 		return -1;
 	}
-	fprintf( stderr, "Part Type (A): 0x%02x%02x (This is the capacity code, in KB)\n", rbuff[2], rbuff[3] );  // Is this Flash size?
+	fprintf( stderr, "Flash Storage: %d kB\n", (rbuff[2]<<8) | rbuff[3] );  // Is this Flash size?
 	fprintf( stderr, "Part UUID    : %02x-%02x-%02x-%02x-%02x-%02x-%02x-%02x\n", rbuff[4], rbuff[5], rbuff[6], rbuff[7], rbuff[8], rbuff[9], rbuff[10], rbuff[11] );
 	fprintf( stderr, "PFlags       : %02x-%02x-%02x-%02x\n", rbuff[12], rbuff[13], rbuff[14], rbuff[15] );
 	fprintf( stderr, "Part Type (B): %02x-%02x-%02x-%02x\n", rbuff[16], rbuff[17], rbuff[18], rbuff[19] );
-	
+
+	iss->flash_size = ((rbuff[2]<<8) | rbuff[3])*1024; // Store before rbuff is reused below.
+	iss->target_chip_type = target_chip_type;
+
+	// Check for read protection
+	wch_link_command( dev, "\x81\x06\x01\x01", 4, (int*)&transferred, rbuff, 1024 );
+	if(transferred != 4) {
+		fprintf(stderr, "Error: could not get read protection status\n");
+		return -1;
+	}
+
+	if(rbuff[3] == 0x01) {
+		fprintf(stderr, "Read protection: enabled\n");
+	} else {
+		fprintf(stderr, "Read protection: disabled\n");
+	}
+
 	return 0;
 }
 
@@ -372,42 +401,41 @@ void * TryInit_WCHLinkE()
 };
 
 
-#if 0
-
-// In case you are using a non-CH32V003 board.
+#if 1
 
+// Flash Bootloader for V20x and V30x series MCUs
 
 const uint8_t * bootloader = (const uint8_t*)
-"\x21\x11\x22\xca\x26\xc8\x93\x77\x15\x00\x99\xcf\xb7\x06\x67\x45" \
-"\xb7\x27\x02\x40\x93\x86\x36\x12\x37\x97\xef\xcd\xd4\xc3\x13\x07" \
-"\xb7\x9a\xd8\xc3\xd4\xd3\xd8\xd3\x93\x77\x25\x00\x9d\xc7\xb7\x27" \
-"\x02\x40\x98\x4b\xad\x66\x37\x33\x00\x40\x13\x67\x47\x00\x98\xcb" \
-"\x98\x4b\x93\x86\xa6\xaa\x13\x67\x07\x04\x98\xcb\xd8\x47\x05\x8b" \
-"\x63\x16\x07\x10\x98\x4b\x6d\x9b\x98\xcb\x93\x77\x45\x00\xa9\xcb" \
-"\x93\x07\xf6\x03\x99\x83\x2e\xc0\x2d\x63\x81\x76\x3e\xc4\xb7\x32" \
-"\x00\x40\xb7\x27\x02\x40\x13\x03\xa3\xaa\xfd\x16\x98\x4b\xb7\x03" \
-"\x02\x00\x33\x67\x77\x00\x98\xcb\x02\x47\xd8\xcb\x98\x4b\x13\x67" \
-"\x07\x04\x98\xcb\xd8\x47\x05\x8b\x69\xe7\x98\x4b\x75\x8f\x98\xcb" \
-"\x02\x47\x13\x07\x07\x04\x3a\xc0\x22\x47\x7d\x17\x3a\xc4\x79\xf7" \
-"\x93\x77\x85\x00\xf1\xcf\x93\x07\xf6\x03\x2e\xc0\x99\x83\x37\x27" \
-"\x02\x40\x3e\xc4\x1c\x4b\xc1\x66\x2d\x63\xd5\x8f\x1c\xcb\x37\x07" \
-"\x00\x20\x13\x07\x07\x20\xb7\x27\x02\x40\xb7\x03\x08\x00\xb7\x32" \
-"\x00\x40\x13\x03\xa3\xaa\x94\x4b\xb3\xe6\x76\x00\x94\xcb\xd4\x47" \
-"\x85\x8a\xf5\xfe\x82\x46\xba\x84\x37\x04\x04\x00\x36\xc2\xc1\x46" \
-"\x36\xc6\x92\x46\x84\x40\x11\x07\x84\xc2\x94\x4b\xc1\x8e\x94\xcb" \
-"\xd4\x47\x85\x8a\xb1\xea\x92\x46\xba\x84\x91\x06\x36\xc2\xb2\x46" \
-"\xfd\x16\x36\xc6\xf9\xfe\x82\x46\xd4\xcb\x94\x4b\x93\xe6\x06\x04" \
-"\x94\xcb\xd4\x47\x85\x8a\x85\xee\xd4\x47\xc1\x8a\x85\xce\xd8\x47" \
-"\xb7\x06\xf3\xff\xfd\x16\x13\x67\x07\x01\xd8\xc7\x98\x4b\x21\x45" \
-"\x75\x8f\x98\xcb\x52\x44\xc2\x44\x61\x01\x02\x90\x23\x20\xd3\x00" \
-"\xf5\xb5\x23\xa0\x62\x00\x3d\xb7\x23\xa0\x62\x00\x55\xb7\x23\xa0" \
-"\x62\x00\xc1\xb7\x82\x46\x93\x86\x06\x04\x36\xc0\xa2\x46\xfd\x16" \
-"\x36\xc4\xb5\xf2\x98\x4b\xb7\x06\xf3\xff\xfd\x16\x75\x8f\x98\xcb" \
-"\x41\x89\x05\xcd\x2e\xc0\x0d\x06\x02\xc4\x09\x82\xb7\x07\x00\x20" \
-"\x32\xc6\x93\x87\x07\x20\x98\x43\x13\x86\x47\x00\xa2\x47\x82\x46" \
-"\x8a\x07\xb6\x97\x9c\x43\x63\x1c\xf7\x00\xa2\x47\x85\x07\x3e\xc4" \
-"\xa2\x46\x32\x47\xb2\x87\xe3\xe0\xe6\xfe\x01\x45\x61\xb7\x41\x45" \
-"\x51\xb7\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff" \
+"\x93\x77\x15\x00\x41\x11\x99\xcf\xb7\x06\x67\x45\xb7\x27\x02\x40" \
+"\x93\x86\x36\x12\x37\x97\xef\xcd\xd4\xc3\x13\x07\xb7\x9a\xd8\xc3" \
+"\xd4\xd3\xd8\xd3\x93\x77\x25\x00\x95\xc7\xb7\x27\x02\x40\x98\x4b" \
+"\xad\x66\x37\x38\x00\x40\x13\x67\x47\x00\x98\xcb\x98\x4b\x93\x86" \
+"\xa6\xaa\x13\x67\x07\x04\x98\xcb\xd8\x47\x05\x8b\x61\xeb\x98\x4b" \
+"\x6d\x9b\x98\xcb\x93\x77\x45\x00\xa9\xcb\x93\x07\xf6\x0f\xa1\x83" \
+"\x2e\xc0\x2d\x68\x81\x76\x3e\xc4\xb7\x08\x02\x00\xb7\x27\x02\x40" \
+"\x37\x33\x00\x40\x13\x08\xa8\xaa\xfd\x16\x98\x4b\x33\x67\x17\x01" \
+"\x98\xcb\x02\x47\xd8\xcb\x98\x4b\x13\x67\x07\x04\x98\xcb\xd8\x47" \
+"\x05\x8b\x41\xeb\x98\x4b\x75\x8f\x98\xcb\x02\x47\x13\x07\x07\x10" \
+"\x3a\xc0\x22\x47\x7d\x17\x3a\xc4\x69\xfb\x93\x77\x85\x00\xd5\xcb" \
+"\x93\x07\xf6\x0f\x2e\xc0\xa1\x83\x3e\xc4\x37\x27\x02\x40\x1c\x4b" \
+"\xc1\x66\x41\x68\xd5\x8f\x1c\xcb\xb7\x16\x00\x20\xb7\x27\x02\x40" \
+"\x93\x08\x00\x04\x37\x03\x20\x00\x98\x4b\x33\x67\x07\x01\x98\xcb" \
+"\xd8\x47\x05\x8b\x75\xff\x02\x47\x3a\xc2\x46\xc6\x32\x47\x0d\xef" \
+"\x98\x4b\x33\x67\x67\x00\x98\xcb\xd8\x47\x05\x8b\x75\xff\xd8\x47" \
+"\x41\x8b\x39\xc3\xd8\x47\xc1\x76\xfd\x16\x13\x67\x07\x01\xd8\xc7" \
+"\x98\x4b\x21\x45\x75\x8f\x98\xcb\x41\x01\x02\x90\x23\x20\xd8\x00" \
+"\x25\xb7\x23\x20\x03\x01\xa5\xb7\x12\x47\x13\x8e\x46\x00\x94\x42" \
+"\x14\xc3\x12\x47\x11\x07\x3a\xc2\x32\x47\x7d\x17\x3a\xc6\xd8\x47" \
+"\x09\x8b\x75\xff\xf2\x86\x5d\xb7\x02\x47\x13\x07\x07\x10\x3a\xc0" \
+"\x22\x47\x7d\x17\x3a\xc4\x49\xf3\x98\x4b\xc1\x76\xfd\x16\x75\x8f" \
+"\x98\xcb\x41\x89\x15\xc9\x2e\xc0\x0d\x06\x02\xc4\x09\x82\x32\xc6" \
+"\xb7\x17\x00\x20\x98\x43\x13\x86\x47\x00\xa2\x47\x82\x46\x8a\x07" \
+"\xb6\x97\x9c\x43\x63\x1c\xf7\x00\xa2\x47\x85\x07\x3e\xc4\xa2\x46" \
+"\x32\x47\xb2\x87\xe3\xe0\xe6\xfe\x01\x45\xbd\xbf\x41\x45\xad\xbf" \
+"\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff" \
+"\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff" \
+"\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff" \
+"\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff" \
 "\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff" \
 "\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff\xff";
 
@@ -439,6 +467,7 @@ static int InternalLinkEHaltMode( void * d, int mode )
 	return 0;
 }
 
+#if 0
 static int LEReadBinaryBlob( void * d, uint32_t offset, uint32_t amount, uint8_t * readbuff )
 {
 	libusb_device_handle * dev = ((struct LinkEProgrammerStruct*)d)->devh;
@@ -496,11 +525,12 @@ static int LEReadBinaryBlob( void * d, uint32_t offset, uint32_t amount, uint8_t
 
 	return 0;
 }
+#endif
 
-#if 0
 static int LEWriteBinaryBlob( void * d, uint32_t address_to_write, uint32_t len, uint8_t * blob )
 {
 	libusb_device_handle * dev = ((struct LinkEProgrammerStruct*)d)->devh;
+	struct InternalState * iss = (struct InternalState*)(((struct LinkEProgrammerStruct*)d)->internal);
 
 	InternalLinkEHaltMode( d, 0 );
 
@@ -509,24 +539,28 @@ static int LEWriteBinaryBlob( void * d, uint32_t address_to_write, uint32_t len,
 	uint8_t rbuff[1024];
 	int transferred;
 
-	int padlen = ((len-1) & (~0x3f)) + 0x40;
+	int padlen = ((len-1) & (~(iss->sector_size-1))) + iss->sector_size;
 
 	wch_link_command( (libusb_device_handle *)dev, "\x81\x06\x01\x01", 4, 0, 0, 0 );
 	wch_link_command( (libusb_device_handle *)dev, "\x81\x06\x01\x01", 4, 0, 0, 0 ); // Not sure why but it seems to work better when we request twice.
 
 	// This contains the write data quantity, in bytes.  (The last 2 octets)
 	// Then it just rollllls on in.
-	char rksbuff[11] = { 0x81, 0x01, 0x08, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00 };
-	rksbuff[9] = len >> 8;
-	rksbuff[10] = len & 0xff;
+	char rksbuff[11] = { 0x81, 0x01, 0x08,
+						 // Address to write
+						 (uint8_t)(address_to_write >> 24), (uint8_t)(address_to_write >> 16),
+						 (uint8_t)(address_to_write >> 8), (uint8_t)(address_to_write & 0xff),
+						 // Length to write
+						 (uint8_t)(len >> 24), (uint8_t)(len >> 16),
+						 (uint8_t)(len >> 8), (uint8_t)(len & 0xff) };
 	wch_link_command( (libusb_device_handle *)dev, rksbuff, 11, 0, 0, 0 );
 	
 	wch_link_command( (libusb_device_handle *)dev, "\x81\x02\x01\x05", 4, 0, 0, 0 );
 	
 	int pplace = 0;
-	for( pplace = 0; pplace < bootloader_len; pplace += 64 )
+	for( pplace = 0; pplace < bootloader_len; pplace += iss->sector_size )
 	{
-		WCHCHECK( libusb_bulk_transfer( (libusb_device_handle *)dev, 0x02, (uint8_t*)(bootloader+pplace), 64, &transferred, WCHTIMEOUT ) );
+		WCHCHECK( libusb_bulk_transfer( (libusb_device_handle *)dev, 0x02, (uint8_t*)(bootloader+pplace), iss->sector_size, &transferred, WCHTIMEOUT ) );
 	}
 	
 	for( i = 0; i < 10; i++ )
@@ -539,30 +573,29 @@ static int LEWriteBinaryBlob( void * d, uint32_t address_to_write, uint32_t len,
 	} 
 	if( i == 10 )
 	{
-		fprintf( stderr, "Error, confusing respones to 02/01/07\n" );
+		fprintf( stderr, "Error, confusing responses to 02/01/07\n" );
 		exit( -109 );
 	}
 	
 	wch_link_command( (libusb_device_handle *)dev, "\x81\x02\x01\x02", 4, 0, 0, 0 );
 
-	for( pplace = 0; pplace < padlen; pplace += 64 )
+	for( pplace = 0; pplace < padlen; pplace += iss->sector_size )
 	{
-		if( pplace + 64 > len )
+		if( pplace + iss->sector_size > len )
 		{
-			uint8_t paddeddata[64];
-			int gap = pplace + 64 - len;
+			uint8_t paddeddata[iss->sector_size];
+			int gap = pplace + iss->sector_size - len;
 			int okcopy = len - pplace;
 			memcpy( paddeddata, blob + pplace, okcopy );
 			memset( paddeddata + okcopy, 0xff, gap );
-			WCHCHECK( libusb_bulk_transfer( (libusb_device_handle *)dev, 0x02, paddeddata, 64, &transferred, WCHTIMEOUT ) );
+			WCHCHECK( libusb_bulk_transfer( (libusb_device_handle *)dev, 0x02, paddeddata, iss->sector_size, &transferred, WCHTIMEOUT ) );
 		}
 		else
 		{
-			WCHCHECK( libusb_bulk_transfer( (libusb_device_handle *)dev, 0x02, blob+pplace, 64, &transferred, WCHTIMEOUT ) );
+			WCHCHECK( libusb_bulk_transfer( (libusb_device_handle *)dev, 0x02, blob+pplace, iss->sector_size, &transferred, WCHTIMEOUT ) );
 		}
 	}
 	return 0;
 }
 
 
-#endif
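
The padding arithmetic in `LEWriteBinaryBlob` above rounds the payload up to a whole number of sectors and fills the tail of the last sector with `0xff`. A minimal standalone sketch of that step follows. `pad_to_sector` and `fill_sector` are hypothetical helper names, not functions in minichlink, and the sketch assumes `len > 0`, `offset < len`, and a power-of-two sector size (64 or 256 in this patch).

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: round len up to a multiple of sector_size.
 * Assumes len > 0 and sector_size is a power of two. */
static uint32_t pad_to_sector(uint32_t len, uint32_t sector_size)
{
	assert(sector_size != 0 && (sector_size & (sector_size - 1)) == 0);
	return ((len - 1) & ~(sector_size - 1)) + sector_size;
}

/* Hypothetical helper: copy the sector that starts at offset into out,
 * filling everything past the end of blob with 0xff (erased flash).
 * Assumes offset < len and out holds sector_size bytes. */
static void fill_sector(uint8_t *out, const uint8_t *blob, uint32_t len,
                        uint32_t offset, uint32_t sector_size)
{
	uint32_t okcopy = (offset + sector_size > len) ? len - offset : sector_size;
	memcpy(out, blob + offset, okcopy);
	memset(out + okcopy, 0xff, sector_size - okcopy);
}
```

For a 300-byte image and the 256-byte sector size set for CH32V20x/V30x parts, `pad_to_sector(300, 256)` gives 512, so two bulk transfers are sent and the second carries 44 data bytes followed by 212 bytes of `0xff`.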
diff --git a/minichlink/serial_dev.c b/minichlink/serial_dev.c
new file mode 100644
index 0000000000000000000000000000000000000000..af5543118357662082d8b7b257322bdd2ff54355
--- /dev/null
+++ b/minichlink/serial_dev.c
@@ -0,0 +1,173 @@
+#include "serial_dev.h"
+
+int serial_dev_create(serial_dev_t *dev, const char* port, unsigned baud) {
+	if (!dev) 
+		return -1;
+	dev->port = port;
+	dev->baud = baud;
+	#ifdef IS_WINDOWS
+	dev->handle = INVALID_HANDLE_VALUE;
+	#else
+	dev->fd = -1;
+	#endif
+	return 0;
+}
+
+int serial_dev_open(serial_dev_t *dev) {
+	fprintf(stderr, "Opening serial port %s at %u baud.\n", dev->port, dev->baud);
+#ifdef IS_WINDOWS
+	// Windows quirk: a bare "COM10" (or higher) is invalid and has to be written as "\\.\COM10".
+	// The prefixed form also works for COM1 through COM9, so let the user pass
+	// any "COMx" string and just prepend the "\\.\" prefix.
+	char winPortName[64];
+	if(dev->port[0] != '\\') {
+		snprintf(winPortName, sizeof(winPortName), "\\\\.\\%s", dev->port);
+	} else {
+		// copy verbatim if string already starts with a '\'
+		snprintf(winPortName, sizeof(winPortName), "%s", dev->port);
+	}
+	dev->handle = CreateFileA(winPortName, GENERIC_READ | GENERIC_WRITE, 0, 0, OPEN_EXISTING, 0,0);
+	if (dev->handle == INVALID_HANDLE_VALUE) {
+		if (GetLastError() == ERROR_FILE_NOT_FOUND) {
+			fprintf(stderr, "Serial port %s not found.\n", dev->port);
+			// weird: without this, errno = 0 (no error).
+			_set_errno(ERROR_FILE_NOT_FOUND);
+			return -1; // Device not found
+		}
+		// Error while opening the device
+		return -1;
+	}
+	DCB dcbSerialParams;
+	dcbSerialParams.DCBlength = sizeof(dcbSerialParams);
+	if (!GetCommState(dev->handle, &dcbSerialParams)) {
+		return -1;
+	}
+	// set baud and 8N1 serial formatting
+	dcbSerialParams.BaudRate = dev->baud;
+	dcbSerialParams.ByteSize = 8;
+	dcbSerialParams.StopBits = ONESTOPBIT;
+	dcbSerialParams.Parity = NOPARITY;
+	// write back
+	if (!SetCommState(dev->handle, &dcbSerialParams)){ 
+		return -1;
+	}
+	// Set the timeout parameters to "no timeout" (blocking).
+	// see https://learn.microsoft.com/en-us/windows/win32/api/winbase/ns-winbase-commtimeouts
+	COMMTIMEOUTS timeouts;
+	timeouts.ReadIntervalTimeout = 0;
+	timeouts.ReadTotalTimeoutConstant = MAXDWORD;
+	timeouts.ReadTotalTimeoutMultiplier = 0;
+	timeouts.WriteTotalTimeoutConstant = MAXDWORD;
+	timeouts.WriteTotalTimeoutMultiplier = 0;
+	// Write the parameters
+	if (!SetCommTimeouts(dev->handle, &timeouts)) {
+		return -1;
+	}
+#else
+	struct termios attr;
+	if ((dev->fd = open(dev->port, O_RDWR | O_NOCTTY)) == -1) {
+		perror("open");
+		return -1;
+	}
+
+	if (tcgetattr(dev->fd, &attr) == -1) {
+		perror("tcgetattr");
+		return -1;
+	}
+
+	cfmakeraw(&attr);
+	cfsetspeed(&attr, dev->baud);
+
+	if (tcsetattr(dev->fd, TCSANOW, &attr) == -1) {
+		perror("tcsetattr");
+		return -1;
+	}
+#endif
+	// all okay if we get here
+	return 0;
+}
+
+int serial_dev_write(serial_dev_t *dev, const void* data, size_t len) {
+#ifdef IS_WINDOWS
+	DWORD dwBytesWritten;
+	if (!WriteFile(dev->handle, data, len, &dwBytesWritten,NULL)) {
+		return -1;
+	}
+	return (int) dwBytesWritten;
+#else
+	return write(dev->fd, data, len);
+#endif
+}
+
+int serial_dev_read(serial_dev_t *dev, void* data, size_t len) {
+#ifdef IS_WINDOWS
+	DWORD dwBytesRead = 0;
+	if (!ReadFile(dev->handle, data, len, &dwBytesRead, NULL)) {
+		return -1;
+	}
+	return (int) dwBytesRead;
+#else
+	return read(dev->fd, data, len);
+#endif
+}
+
+int serial_dev_do_dtr_reset(serial_dev_t *dev) {
+#ifdef IS_WINDOWS
+	// EscapeCommFunction returns 0 on fail
+	if(EscapeCommFunction(dev->handle, SETDTR) == 0) {
+		return -1;
+	}
+	if(EscapeCommFunction(dev->handle, CLRDTR) == 0) {
+		return -1;
+	}
+#else
+	int argp = TIOCM_DTR;
+	// Arduino DTR reset.
+	if (ioctl(dev->fd, TIOCMBIC, &argp) == -1) {
+		perror("ioctl");
+		return -1;
+	}
+
+	if (tcdrain(dev->fd) == -1) {
+		perror("tcdrain");
+		return -1;
+	}
+
+	if (ioctl(dev->fd, TIOCMBIS, &argp) == -1) {
+		perror("ioctl");
+		return -1;
+	}
+#endif
+	return 0;
+}
+
+int serial_dev_flush_rx(serial_dev_t *dev) {
+#ifdef IS_WINDOWS
+	// PurgeComm returns 0 on fail
+	if (PurgeComm(dev->handle, PURGE_RXCLEAR) == 0) {
+		return -1;
+	}
+#else
+	if (tcflush(dev->fd, TCIFLUSH) == -1) {
+		perror("tcflush");
+		return -1;
+	}
+#endif
+	return 0;
+}
+
+int serial_dev_close(serial_dev_t *dev) {
+#ifdef IS_WINDOWS
+	if(!CloseHandle(dev->handle)) {
+		return -1;
+	}
+	dev->handle = INVALID_HANDLE_VALUE;
+#else
+	int ret = 0;
+	if((ret = close(dev->fd)) != 0) {
+		return ret;
+	}
+	dev->fd = -1;
+#endif
+	return 0;
+}
\ No newline at end of file
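
The port-name handling in `serial_dev_open` can be tried on any platform, since it is plain string formatting. The sketch below isolates it. `win_port_path` is a hypothetical name, and the truncation check on the `snprintf` result is an addition that the patch itself does not make.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: write the Win32 device path for port into out.
 * A name that already starts with a backslash is copied verbatim;
 * anything else gets the "\\.\" device-namespace prefix.
 * Returns 0 on success, -1 if out is too small (check added here). */
static int win_port_path(const char *port, char *out, size_t outlen)
{
	int n;
	assert(port != NULL && out != NULL);
	if (port[0] != '\\')
		n = snprintf(out, outlen, "\\\\.\\%s", port);
	else
		n = snprintf(out, outlen, "%s", port);
	return (n < 0 || (size_t)n >= outlen) ? -1 : 0;
}
```

With this helper, both `COM3` and `COM10` come out in the prefixed form, and a string that is already prefixed passes through unchanged.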
diff --git a/minichlink/serial_dev.h b/minichlink/serial_dev.h
new file mode 100644
index 0000000000000000000000000000000000000000..7c7e4f1eea1f1b369b08f0508739112a84c60ca5
--- /dev/null
+++ b/minichlink/serial_dev.h
@@ -0,0 +1,48 @@
+#ifndef _SERIAL_DEV_H
+#define _SERIAL_DEV_H
+
+#include <stddef.h>
+
+#if defined(WINDOWS) || defined(WIN32) || defined(_WIN32)
+#define WIN32_LEAN_AND_MEAN
+#include <windows.h>
+#define IS_WINDOWS
+#define DEFAULT_SERIAL_NAME "\\\\.\\COM3"
+#else
+#include <unistd.h>
+#include <termios.h>
+#include <fcntl.h>
+#include <sys/ioctl.h>
+#define IS_POSIX
+#define DEFAULT_SERIAL_NAME "/dev/ttyACM0"
+#endif
+/* these are available on all platforms */
+#include <errno.h>
+#include <stdio.h>
+
+typedef struct {
+    const char* port;
+    unsigned baud;
+#ifdef IS_WINDOWS
+    HANDLE handle;
+#else
+    int fd;
+#endif
+} serial_dev_t;
+
+/* returns 0 if OK */
+int serial_dev_create(serial_dev_t *dev, const char* port, unsigned baud);
+/* returns 0 if OK */
+int serial_dev_open(serial_dev_t *dev);
+/* returns -1 on write error */
+int serial_dev_write(serial_dev_t *dev, const void* data, size_t len);
+/* returns -1 on read error */
+int serial_dev_read(serial_dev_t *dev, void* data, size_t len);
+/* returns -1 on reset error */
+int serial_dev_do_dtr_reset(serial_dev_t *dev);
+/* returns -1 on flush error */
+int serial_dev_flush_rx(serial_dev_t *dev);
+/* returns -1 on close error */
+int serial_dev_close(serial_dev_t *dev);
+
+#endif
diff --git a/minichlink/terminalhelp.h b/minichlink/terminalhelp.h
new file mode 100644
index 0000000000000000000000000000000000000000..38cbcf16a89821974d37b5530a476629936b0fcc
--- /dev/null
+++ b/minichlink/terminalhelp.h
@@ -0,0 +1,159 @@
+// terminalhelp from mini-rv32ima.
+#ifndef _TERMINALHELP_H
+#define _TERMINALHELP_H
+
+#include <stdint.h>
+
+// Provides the following:
+static void CaptureKeyboardInput()    __attribute__((used));
+static void ResetKeyboardInput()      __attribute__((used));
+static uint64_t GetTimeMicroseconds() __attribute__((used));
+static int ReadKBByte()               __attribute__((used));
+static int IsKBHit()                  __attribute__((used));
+
+#if defined(WINDOWS) || defined(WIN32) || defined(_WIN32)
+
+#include <windows.h>
+#include <conio.h>
+
+#define strtoll _strtoi64
+
+static void CaptureKeyboardInput()
+{
+	system(""); // Poorly documented trick: Enable VT100 Windows mode.
+}
+
+static void ResetKeyboardInput()
+{
+}
+
+static uint64_t GetTimeMicroseconds()
+{
+	static LARGE_INTEGER lpf;
+	LARGE_INTEGER li;
+
+	if( !lpf.QuadPart )
+		QueryPerformanceFrequency( &lpf );
+
+	QueryPerformanceCounter( &li );
+	return ((uint64_t)li.QuadPart * 1000000LL) / (uint64_t)lpf.QuadPart;
+}
+
+
+static int IsKBHit()
+{
+	return _kbhit();
+}
+
+static int ReadKBByte()
+{
+	// This code is kind of tricky: it converts Windows arrow-key codes
+	// to VT100 arrow-key sequences.
+	static int is_escape_sequence = 0;
+	int r;
+	if( is_escape_sequence == 1 )
+	{
+		is_escape_sequence++;
+		return '[';
+	}
+
+	r = _getch();
+
+	if( is_escape_sequence )
+	{
+		is_escape_sequence = 0;
+		switch( r )
+		{
+			case 'H': return 'A'; // Up
+			case 'P': return 'B'; // Down
+			case 'K': return 'D'; // Left
+			case 'M': return 'C'; // Right
+			case 'G': return 'H'; // Home
+			case 'O': return 'F'; // End
+			default: return r; // Unknown code.
+		}
+	}
+	else
+	{
+		switch( r )
+		{
+			case 13: return 10; //cr->lf
+			case 224: is_escape_sequence = 1; return 27; // Escape arrow keys
+			default: return r;
+		}
+	}
+}
+
+#else
+
+#include <sys/ioctl.h>
+#include <termios.h>
+#undef BS0
+#undef BS1
+#include <unistd.h>
+#include <signal.h>
+#include <sys/time.h>
+
+static void CtrlC()
+{
+	exit( 0 );
+}
+
+// Override keyboard, so we can capture all keyboard input for the VM.
+static void CaptureKeyboardInput()
+{
+	// Hook exit, because we want to re-enable keyboard.
+	atexit(ResetKeyboardInput);
+	signal(SIGINT, CtrlC);
+
+	struct termios term;
+	tcgetattr(0, &term);
+	term.c_lflag &= ~(ICANON | ECHO); // Disable echo as well
+	tcsetattr(0, TCSANOW, &term);
+}
+
+static void ResetKeyboardInput()
+{
+	// Re-enable echo, etc. on keyboard.
+	struct termios term;
+	tcgetattr(0, &term);
+	term.c_lflag |= ICANON | ECHO;
+	tcsetattr(0, TCSANOW, &term);
+}
+
+static uint64_t GetTimeMicroseconds()
+{
+	struct timeval tv;
+	gettimeofday( &tv, 0 );
+	return tv.tv_usec + ((uint64_t)(tv.tv_sec)) * 1000000LL;
+}
+
+static int is_eofd;
+
+static int ReadKBByte()
+{
+	if( is_eofd ) return 0xffffffff;
+	char rxchar = 0;
+	int rread = read(fileno(stdin), (char*)&rxchar, 1);
+
+	if( rread > 0 ) // Tricky: getchar can't be used with arrow keys.
+		return rxchar;
+	else
+		return -1;
+}
+
+static int IsKBHit()
+{
+	if( is_eofd ) return -1;
+	int byteswaiting;
+	ioctl(0, FIONREAD, &byteswaiting);
+	if( !byteswaiting && write( fileno(stdin), 0, 0 ) != 0 ) { is_eofd = 1; return -1; } // Is end-of-file for 
+	return !!byteswaiting;
+}
+
+
+#endif
+
+
+#endif
+
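
The Windows `ReadKBByte` in `terminalhelp.h` expands one two-byte console key (prefix 224 plus a scan code) into the three-byte VT100 sequence `ESC [ X` across successive calls. The sketch below restates that state machine so it can be exercised without a console. `vt100_state`, `vt100_next_byte`, and the scripted key source are hypothetical names introduced for illustration, and `read_key` stands in for `_getch()`.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical state holder; terminalhelp.h keeps this in a static int. */
typedef struct { int stage; } vt100_state;

/* Hypothetical helper mirroring the Windows ReadKBByte logic. */
static int vt100_next_byte(vt100_state *st, int (*read_key)(void))
{
	int r;
	assert(st != NULL && read_key != NULL);

	if (st->stage == 1) {      /* ESC was just sent: emit '[' without reading */
		st->stage = 2;
		return '[';
	}

	r = read_key();

	if (st->stage) {           /* final byte: map scan code to VT100 letter */
		st->stage = 0;
		switch (r) {
		case 'H': return 'A';  /* Up */
		case 'P': return 'B';  /* Down */
		case 'K': return 'D';  /* Left */
		case 'M': return 'C';  /* Right */
		case 'G': return 'H';  /* Home */
		case 'O': return 'F';  /* End */
		default:  return r;    /* Unknown code, passed through */
		}
	}

	switch (r) {
	case 13:  return 10;                    /* CR -> LF */
	case 224: st->stage = 1; return 27;     /* extended-key prefix -> ESC */
	default:  return r;
	}
}

/* Scripted key source for trying the helper without a console. */
static const int *scripted_keys;
static int scripted_getch(void) { return *scripted_keys++; }
```

Feeding the keys 224, `'H'`, 13, `'a'` through five calls yields 27, `'['`, `'A'`, 10, `'a'`: the Up arrow becomes `ESC [ A`, and Enter is reported as a line feed.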
diff --git a/minichlink/winbuild.bat b/minichlink/winbuild.bat
index 785472441f6d5d1db88cd6282afe61fc9f21a39b..9c4095bc4ce65dcea556e50f2dc1a094cc04f303 100644
--- a/minichlink/winbuild.bat
+++ b/minichlink/winbuild.bat
@@ -1 +1 @@
-tcc minichlink.c pgm-esp32s2-ch32xx.c  pgm-wch-linke.c minichgdb.c nhc-link042.c -DWIN32 -lws2_32 -lsetupapi libusb-1.0.dll 
+tcc minichlink.c pgm-esp32s2-ch32xx.c serial_dev.c ardulink.c pgm-b003fun.c pgm-wch-linke.c minichgdb.c nhc-link042.c -DWIN32 -lws2_32 -lsetupapi libusb-1.0.dll 
diff --git a/platformio.ini b/platformio.ini
index 3edc591904e025923f2883664a83269f1e1840eb..498c9b2dd0204d159f5d35d9cf586e6c2397753c 100644
--- a/platformio.ini
+++ b/platformio.ini
@@ -45,8 +45,8 @@ build_src_filter = ${fun_base.build_src_filter} +<examples/optionbytes>
 [env:run_from_ram]
 build_src_filter = ${fun_base.build_src_filter} +<examples/run_from_ram>
 
-[env:sandbox]
-build_src_filter = ${fun_base.build_src_filter} +<examples/sandbox>
+[env:template]
+build_src_filter = ${fun_base.build_src_filter} +<examples/template>
 
 [env:self_modify_code]
 build_src_filter = ${fun_base.build_src_filter} +<examples/self_modify_code>