Skip to content

记一次 SpringBoot 中文乱码问题调查

🏷️ Spring Cloud Spring Boot

现象

  • 现象是请求中的中文保存到数据库后会乱码。
  • 乱码的字符并不是什么特殊字符。
  • 删除了乱码字符前面的字符后,乱码的地方会向后偏移。

调查过程

第一反应是数据库字段的字符集设置导致的,但修改成 utf8mb4 字符集后问题依旧。

通过本地调试发现,直接请求接口的字符串并没有乱码。

通过测试环境日志发现,Controller 接收到的参数中字符串已经乱码了。

测试环境和开发环境的区别是其请求是通过网关转发的。

调查网关后,发现其中有一个 Filter 曾对请求内容进行了转码处理。具体代码如下:

java
import java.nio.charset.StandardCharsets;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.autoconfigure.condition.ConditionalOnProperty;
import org.springframework.cloud.gateway.filter.GatewayFilterChain;
import org.springframework.cloud.gateway.filter.GlobalFilter;
import org.springframework.core.Ordered;
import org.springframework.core.io.buffer.DataBuffer;
import org.springframework.core.io.buffer.DataBufferUtils;
import org.springframework.core.io.buffer.NettyDataBufferFactory;
import org.springframework.http.HttpHeaders;
import org.springframework.http.HttpMethod;
import org.springframework.http.MediaType;
import org.springframework.http.server.reactive.ServerHttpRequest;
import org.springframework.http.server.reactive.ServerHttpRequestDecorator;
import org.springframework.stereotype.Component;
import org.springframework.web.server.ServerWebExchange;
import io.netty.buffer.ByteBufAllocator;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

/**
 * 跨站脚本过滤器
 */
@Component
@ConditionalOnProperty(value = "security.xss.enabled", havingValue = "true")
public class XssFilter implements GlobalFilter, Ordered
{
    // 跨站脚本的 xss 配置,nacos 自行添加
    @Autowired
    private XssProperties xss;

    @Override
    public Mono<Void> filter(ServerWebExchange exchange, GatewayFilterChain chain)
    {
        ServerHttpRequest request = exchange.getRequest();
        // GET DELETE 不过滤
        HttpMethod method = request.getMethod();
        if (method == null || method.matches("GET") || method.matches("DELETE"))
        {
            return chain.filter(exchange);
        }
        // 非 json 类型,不过滤
        if (!isJsonRequest(exchange))
        {
            return chain.filter(exchange);
        }
        // excludeUrls 不过滤
        String url = request.getURI().getPath();
        if (StringUtils.matches(url, xss.getExcludeUrls()))
        {
            return chain.filter(exchange);
        }
        ServerHttpRequestDecorator httpRequestDecorator = requestDecorator(exchange);
        return chain.filter(exchange.mutate().request(httpRequestDecorator).build());

    }

    private ServerHttpRequestDecorator requestDecorator(ServerWebExchange exchange)
    {
        ServerHttpRequestDecorator serverHttpRequestDecorator = new ServerHttpRequestDecorator(exchange.getRequest())
        {
            @Override
            public Flux<DataBuffer> getBody()
            {
                Flux<DataBuffer> body = super.getBody();
                return body.map(dataBuffer -> {
                    byte[] content = new byte[dataBuffer.readableByteCount()];
                    dataBuffer.read(content);
                    DataBufferUtils.release(dataBuffer);
                    String bodyStr = new String(content, StandardCharsets.UTF_8);
                    // 防 xss 攻击过滤
                    bodyStr = EscapeUtil.clean(bodyStr);
                    // 转成字节
                    byte[] bytes = bodyStr.getBytes();
                    NettyDataBufferFactory nettyDataBufferFactory = new NettyDataBufferFactory(ByteBufAllocator.DEFAULT);
                    DataBuffer buffer = nettyDataBufferFactory.allocateBuffer(bytes.length);
                    buffer.write(bytes);
                    return buffer;
                });
            }

            @Override
            public HttpHeaders getHeaders()
            {
                HttpHeaders httpHeaders = new HttpHeaders();
                httpHeaders.putAll(super.getHeaders());
                // 由于修改了请求体的 body,导致 content-length 长度不确定,因此需要删除原先的 content-length
                httpHeaders.remove(HttpHeaders.CONTENT_LENGTH);
                httpHeaders.set(HttpHeaders.TRANSFER_ENCODING, "chunked");
                return httpHeaders;
            }

        };
        return serverHttpRequestDecorator;
    }

    /**
     * 是否是 Json 请求
     *
     * @param
     */
    public boolean isJsonRequest(ServerWebExchange exchange)
    {
        String header = exchange.getRequest().getHeaders().getFirst(HttpHeaders.CONTENT_TYPE);
        return StringUtils.startsWithIgnoreCase(header, MediaType.APPLICATION_JSON_VALUE);
    }

    @Override
    public int getOrder()
    {
        return -100;
    }
}

本地调试网关服务时发现出问题的是 getBody() 方法。
当请求 body 较长时,byte[] content = new byte[dataBuffer.readableByteCount()]; 变量中获取到的并不是全部的请求体,而只是其中的部分内容。
这就解释了为什么会固定在一定长度的时候会乱码。

java
@Override
public Flux<DataBuffer> getBody()
{
    Flux<DataBuffer> body = super.getBody();
    return body.map(dataBuffer -> {
        byte[] content = new byte[dataBuffer.readableByteCount()];
        dataBuffer.read(content);
        DataBufferUtils.release(dataBuffer);
        String bodyStr = new String(content, StandardCharsets.UTF_8);
        // 防 xss 攻击过滤
        bodyStr = EscapeUtil.clean(bodyStr);
        // 转成字节
        byte[] bytes = bodyStr.getBytes();
        NettyDataBufferFactory nettyDataBufferFactory = new NettyDataBufferFactory(ByteBufAllocator.DEFAULT);
        DataBuffer buffer = nettyDataBufferFactory.allocateBuffer(bytes.length);
        buffer.write(bytes);
        return buffer;
    });
}

这篇博客中博主遇到了同样的问题,并给出了一种解决方案

java
.route("route1",
    r -> r.method(HttpMethod.POST)
            .and().readBody(String.class, requestBody -> {
        // 这里不对 body 做判断处理
        return true;
    }).and().path(SERVICE1)
    .filters(f -> {
        f.filters(list);
        return f;
    }).uri(URI));

但是这种方法需要修改代码,而现在项目中网关的路由配置一般都是放在配置中心上的。
查看 readBody() 方法的代码,发现这里指定的是一个 Predicate (断言)。

参考这篇博客可以通过修改网关配置文件实现相同的效果。

修改后网关路由配置如下:

yaml
spring:
  cloud:
    gateway:
      routes:
        - id: some-system
          uri: lb://some-system
          filters:
            - StripPrefix=1
          predicates:
            - Path=/some-system/**
            - name: ReadBodyPredicateFactory
              args:
                inClass: '#{T(String)}'

之后启动网关服务时报了如下错误:

Unable to find RoutePredicateFactory with name ReadBodyPredicateFactory

完整错误消息堆栈如下:

java
14:38:40.771 [boundedElastic-27] ERROR o.s.c.g.r.CachingRouteLocator - [handleRefreshError,94] - Refresh routes error !!!
java.lang.IllegalArgumentException: Unable to find RoutePredicateFactory with name ReadBodyPredicateFactory
	at org.springframework.cloud.gateway.route.RouteDefinitionRouteLocator.lookup(RouteDefinitionRouteLocator.java:203)
	at org.springframework.cloud.gateway.route.RouteDefinitionRouteLocator.combinePredicates(RouteDefinitionRouteLocator.java:192)
	at org.springframework.cloud.gateway.route.RouteDefinitionRouteLocator.convertToRoute(RouteDefinitionRouteLocator.java:116)
	at reactor.core.publisher.FluxMap$MapSubscriber.onNext(FluxMap.java:106)
	at reactor.core.publisher.FluxFlatMap$FlatMapMain.tryEmitScalar(FluxFlatMap.java:488)
	at reactor.core.publisher.FluxFlatMap$FlatMapMain.onNext(FluxFlatMap.java:421)
	at reactor.core.publisher.FluxMergeSequential$MergeSequentialMain.drain(FluxMergeSequential.java:432)
	at reactor.core.publisher.FluxMergeSequential$MergeSequentialMain.innerComplete(FluxMergeSequential.java:328)
	at reactor.core.publisher.FluxMergeSequential$MergeSequentialInner.onComplete(FluxMergeSequential.java:584)
	at reactor.core.publisher.FluxMap$MapSubscriber.onComplete(FluxMap.java:142)
	at reactor.core.publisher.FluxFilter$FilterSubscriber.onComplete(FluxFilter.java:166)
	at reactor.core.publisher.FluxMap$MapConditionalSubscriber.onComplete(FluxMap.java:269)
	at reactor.core.publisher.FluxFilter$FilterConditionalSubscriber.onComplete(FluxFilter.java:300)
	at reactor.core.publisher.FluxFlatMap$FlatMapMain.checkTerminated(FluxFlatMap.java:846)
	at reactor.core.publisher.FluxFlatMap$FlatMapMain.drainLoop(FluxFlatMap.java:608)
	at reactor.core.publisher.FluxFlatMap$FlatMapMain.innerComplete(FluxFlatMap.java:894)
	at reactor.core.publisher.FluxFlatMap$FlatMapInner.onComplete(FluxFlatMap.java:997)
	at reactor.core.publisher.Operators$MonoSubscriber.complete(Operators.java:1799)
	at reactor.core.publisher.MonoCollectList$MonoCollectListSubscriber.onComplete(MonoCollectList.java:128)
	at org.springframework.cloud.commons.publisher.FluxFirstNonEmptyEmitting$FirstNonEmptyEmittingSubscriber.onComplete(FluxFirstNonEmptyEmitting.java:325)
	at reactor.core.publisher.FluxSubscribeOn$SubscribeOnSubscriber.onComplete(FluxSubscribeOn.java:166)
	at reactor.core.publisher.FluxIterable$IterableSubscription.fastPath(FluxIterable.java:360)
	at reactor.core.publisher.FluxIterable$IterableSubscription.request(FluxIterable.java:225)
	at reactor.core.publisher.FluxSubscribeOn$SubscribeOnSubscriber.requestUpstream(FluxSubscribeOn.java:131)
	at reactor.core.publisher.FluxSubscribeOn$SubscribeOnSubscriber.onSubscribe(FluxSubscribeOn.java:124)
	at reactor.core.publisher.FluxIterable.subscribe(FluxIterable.java:164)
	at reactor.core.publisher.FluxIterable.subscribe(FluxIterable.java:86)
	at reactor.core.publisher.Flux.subscribe(Flux.java:8402)
	at reactor.core.publisher.FluxFlatMap.trySubscribeScalarMap(FluxFlatMap.java:200)
	at reactor.core.publisher.MonoFlatMapMany.subscribeOrReturn(MonoFlatMapMany.java:49)
	at reactor.core.publisher.FluxFromMonoOperator.subscribe(FluxFromMonoOperator.java:76)
	at reactor.core.publisher.FluxSubscribeOn$SubscribeOnSubscriber.run(FluxSubscribeOn.java:194)
	at reactor.core.scheduler.WorkerTask.call(WorkerTask.java:84)
	at reactor.core.scheduler.WorkerTask.call(WorkerTask.java:37)
	at java.base/java.util.concurrent.FutureTask.run$$$capture(FutureTask.java:264)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java)
	at java.base/java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:304)
	at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1130)
	at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:630)
	at java.base/java.lang.Thread.run(Thread.java:832)

报错的方法源码如下:

java
@SuppressWarnings("unchecked")
private AsyncPredicate<ServerWebExchange> lookup(RouteDefinition route, PredicateDefinition predicate) {
    RoutePredicateFactory<Object> factory = this.predicates.get(predicate.getName());
    if (factory == null) {
        throw new IllegalArgumentException("Unable to find RoutePredicateFactory with name " + predicate.getName());
    }
    if (logger.isDebugEnabled()) {
        logger.debug("RouteDefinition " + route.getId() + " applying " + predicate.getArgs() + " to "
                + predicate.getName());
    }

    // @formatter:off
    Object config = this.configurationService.with(factory)
            .name(predicate.getName())
            .properties(predicate.getArgs())
            .eventFunction((bound, properties) -> new PredicateArgsEvent(
                    RouteDefinitionRouteLocator.this, route.getId(), properties))
            .bind();
    // @formatter:on

    return factory.applyAsync(config);
}

Debug 发现 this.predicates 的内容如下:

java
predicates = {LinkedHashMap@11025}  size = 13
 "After" -> {AfterRoutePredicateFactory@14744} "[AfterRoutePredicateFactory@7ea8224b configClass = AfterRoutePredicateFactory.Config]"
 "Before" -> {BeforeRoutePredicateFactory@14746} "[BeforeRoutePredicateFactory@5a010eec configClass = BeforeRoutePredicateFactory.Config]"
 "Between" -> {BetweenRoutePredicateFactory@14748} "[BetweenRoutePredicateFactory@623ded82 configClass = BetweenRoutePredicateFactory.Config]"
 "Cookie" -> {CookieRoutePredicateFactory@14750} "[CookieRoutePredicateFactory@180e33b0 configClass = CookieRoutePredicateFactory.Config]"
 "Header" -> {HeaderRoutePredicateFactory@14752} "[HeaderRoutePredicateFactory@270be080 configClass = HeaderRoutePredicateFactory.Config]"
 "Host" -> {HostRoutePredicateFactory@14754} "[HostRoutePredicateFactory@752ffce3 configClass = HostRoutePredicateFactory.Config]"
 "Method" -> {MethodRoutePredicateFactory@14756} "[MethodRoutePredicateFactory@78f35e39 configClass = MethodRoutePredicateFactory.Config]"
 "Path" -> {PathRoutePredicateFactory@14758} "[PathRoutePredicateFactory@11896124 configClass = PathRoutePredicateFactory.Config]"
 "Query" -> {QueryRoutePredicateFactory@14760} "[QueryRoutePredicateFactory@69fe8c75 configClass = QueryRoutePredicateFactory.Config]"
 "ReadBody" -> {ReadBodyRoutePredicateFactory@14762} "[ReadBodyRoutePredicateFactory@633cad4d configClass = ReadBodyRoutePredicateFactory.Config]"
 "RemoteAddr" -> {RemoteAddrRoutePredicateFactory@14764} "[RemoteAddrRoutePredicateFactory@15c3585 configClass = RemoteAddrRoutePredicateFactory.Config]"
 "Weight" -> {WeightRoutePredicateFactory@14766} "[WeightRoutePredicateFactory@5b86f4cb configClass = WeightConfig]"
 "CloudFoundryRouteService" -> {CloudFoundryRouteServiceRoutePredicateFactory@14768} "[CloudFoundryRouteServiceRoutePredicateFactory@468646ea configClass = Object]"

也就是说配置文件中 predicate.name 应该为 ReadBody 而不是 ReadBodyPredicateFactory

this.predicates 赋值相关的代码如下,从中可以看到在 NameUtils.normalizeRoutePredicateName 方法中去除了后缀 RoutePredicateFactory

估计这是由于项目中使用的 Gateway 版本比较新(spring-cloud-gateway-server:3.0.3)导致的。

RouteDefinitionRouteLocator.java

java
private void initFactories(List<RoutePredicateFactory> predicates) {
    predicates.forEach(factory -> {
        String key = factory.name();
        if (this.predicates.containsKey(key)) {
            this.logger.warn("A RoutePredicateFactory named " + key + " already exists, class: "
                    + this.predicates.get(key) + ". It will be overwritten.");
        }
        this.predicates.put(key, factory);
        if (logger.isInfoEnabled()) {
            logger.info("Loaded RoutePredicateFactory [" + key + "]");
        }
    });
}

RoutePredicateFactory.java

java
default String name() {
    return NameUtils.normalizeRoutePredicateName(getClass());
}

NameUtils.java

java
public static String normalizeRoutePredicateName(Class<? extends RoutePredicateFactory> clazz) {
    return removeGarbage(clazz.getSimpleName().replace(RoutePredicateFactory.class.getSimpleName(), ""));
}

private static String removeGarbage(String s) {
    int garbageIdx = s.indexOf("$Mockito");
    if (garbageIdx > 0) {
        return s.substring(0, garbageIdx);
    }

    return s;
}

初次之外还需要配置一个 Predicate 类型的 Bean 并 自定义个 RoutePredicateFactory (从 ReadBodyRoutePredicateFactory 复制修改)。
否则会出现 NPE 和 404 NOT FOUND 错误。(解决方法参考自前面提到的博客另一篇博客

NPE 错误如下:

java
16:13:00.409 [reactor-http-epoll-3] ERROR o.s.c.g.h.RoutePredicateHandlerMapping - [lambda$null$3,132] - Error applying predicate for route: launch-tencent
java.lang.NullPointerException: null
    at org.springframework.cloud.gateway.handler.predicate.ReadBodyRoutePredicateFactory$1.lambda$null$1(ReadBodyRoutePredicateFactory.java:96)
    at reactor.core.publisher.FluxMap$MapSubscriber.onNext(FluxMap.java:106)
    at reactor.core.publisher.FluxPeek$PeekSubscriber.onNext(FluxPeek.java:200)

报错的代码是最后一行 config.getPredicate().test(objectValue) ,由于没有配置 config.getPredicate() 导致了 NPE。

java
return ServerWebExchangeUtils.cacheRequestBodyAndRequest(exchange,
    (serverHttpRequest) -> ServerRequest
        .create(exchange.mutate().request(serverHttpRequest).build(), messageReaders)
        .bodyToMono(inClass).doOnNext(objectValue -> exchange.getAttributes()
            .put(CACHE_REQUEST_BODY_OBJECT_KEY, objectValue))
        .map(objectValue -> config.getPredicate().test(objectValue)));

添加一个自定义的断言 alwaysTruePredicate ,并在配置文件的 args 中指定使用这个断言 predicate: '#{@alwaysTruePredicate}'

java
@Bean
public Predicate alwaysTruePredicate(){
    return new Predicate() {
        @Override
        public boolean test(Object o) {
            return true;
        }
    };
}

404 NOT FOUND 错误需要在 ReadBodyRoutePredicateFactory 的基础上,在 .map(objectValue -> config.getPredicate().test(objectValue)) 的后面加一段 .thenReturn(true) 处理,使其永远返回 true

对策

  1. 添加工厂类 GwReadBodyRoutePredicateFactory

    java
    import lombok.extern.slf4j.Slf4j;
    import org.reactivestreams.Publisher;
    import org.springframework.cloud.gateway.handler.AsyncPredicate;
    import org.springframework.cloud.gateway.handler.predicate.AbstractRoutePredicateFactory;
    import org.springframework.cloud.gateway.support.ServerWebExchangeUtils;
    import org.springframework.http.codec.HttpMessageReader;
    import org.springframework.stereotype.Component;
    import org.springframework.web.reactive.function.server.HandlerStrategies;
    import org.springframework.web.reactive.function.server.ServerRequest;
    import org.springframework.web.server.ServerWebExchange;
    import reactor.core.publisher.Mono;
    
    import java.util.List;
    import java.util.Map;
    import java.util.function.Predicate;
    
    /**
     * Predicate that reads the body and applies a user provided predicate to run on the body.
     * The body is cached in memory so that possible subsequent calls to the predicate do not
     * need to deserialize again.
     */
    @Component
    @Slf4j
    public class GwReadBodyRoutePredicateFactory extends
            AbstractRoutePredicateFactory<GwReadBodyRoutePredicateFactory.Config> {
    
        private static final String TEST_ATTRIBUTE = "read_body_predicate_test_attribute";
    
        private static final String CACHE_REQUEST_BODY_OBJECT_KEY = "cachedRequestBodyObject";
    
        private final List<HttpMessageReader<?>> messageReaders;
    
        public GwReadBodyRoutePredicateFactory() {
            super(GwReadBodyRoutePredicateFactory.Config.class);
            this.messageReaders = HandlerStrategies.withDefaults().messageReaders();
        }
    
        public GwReadBodyRoutePredicateFactory(List<HttpMessageReader<?>> messageReaders) {
            super(GwReadBodyRoutePredicateFactory.Config.class);
            this.messageReaders = messageReaders;
        }
    
        @Override
        @SuppressWarnings("unchecked")
        public AsyncPredicate<ServerWebExchange> applyAsync(
                GwReadBodyRoutePredicateFactory.Config config
        ) {
            return new AsyncPredicate<ServerWebExchange>() {
                @Override
                public Publisher<Boolean> apply(ServerWebExchange exchange) {
                    Class inClass = config.getInClass();
    
                    Object cachedBody = exchange.getAttribute(CACHE_REQUEST_BODY_OBJECT_KEY);
                    Mono<?> modifiedBody;
                    // We can only read the body from the request once, once that happens if
                    // we try to read the body again an exception will be thrown. The below
                    // if/else caches the body object as a request attribute in the
                    // ServerWebExchange so if this filter is run more than once (due to more
                    // than one route using it) we do not try to read the request body
                    // multiple times
                    if (cachedBody != null) {
                        try {
                            boolean test = config.predicate.test(cachedBody);
                            exchange.getAttributes().put(TEST_ATTRIBUTE, test);
                            return Mono.just(test);
                        } catch (ClassCastException e) {
                            if (log.isDebugEnabled()) {
                                log.debug("Predicate test failed because class in predicate "
                                        + "does not match the cached body object", e);
                            }
                        }
                        return Mono.just(false);
                    } else {
                        return ServerWebExchangeUtils.cacheRequestBodyAndRequest(exchange,
                                (serverHttpRequest) -> ServerRequest.create(
                                                exchange.mutate().request(serverHttpRequest).build(), messageReaders)
                                        .bodyToMono(inClass)
                                        .doOnNext(objectValue -> exchange.getAttributes()
                                                .put(CACHE_REQUEST_BODY_OBJECT_KEY, objectValue))
                                        .map(objectValue -> config.getPredicate().test(objectValue))
                                        .thenReturn(true));
                    }
                }
    
                @Override
                public String toString() {
                    return String.format("ReadBody: %s", config.getInClass());
                }
            };
        }
    
        @Override
        @SuppressWarnings("unchecked")
        public Predicate<ServerWebExchange> apply(
                GwReadBodyRoutePredicateFactory.Config config
        ) {
            throw new UnsupportedOperationException("GwReadBodyPredicateFactory is only async.");
        }
    
        public static class Config {
    
            private Class inClass;
    
            private Predicate predicate;
    
            private Map<String, Object> hints;
    
            public Class getInClass() {
                return inClass;
            }
    
            public GwReadBodyRoutePredicateFactory.Config setInClass(Class inClass) {
                this.inClass = inClass;
                return this;
            }
    
            public Predicate getPredicate() {
                return predicate;
            }
    
            public GwReadBodyRoutePredicateFactory.Config setPredicate(Predicate predicate) {
                this.predicate = predicate;
                return this;
            }
    
            public <T> GwReadBodyRoutePredicateFactory.Config setPredicate(Class<T> inClass, Predicate<T> predicate) {
                setInClass(inClass);
                this.predicate = predicate;
                return this;
            }
    
            public Map<String, Object> getHints() {
                return hints;
            }
    
            public GwReadBodyRoutePredicateFactory.Config setHints(Map<String, Object> hints) {
                this.hints = hints;
                return this;
            }
        }
    }
  2. 添加一个 GwReadBody 用的断言 Bean alwaysTruePredicate

    java
    @Bean
    public Predicate alwaysTruePredicate(){
        return new Predicate() {
            @Override
            public boolean test(Object o) {
                return true;
            }
        };
    }
  3. 在网关的路由配置中添加 GwReadBody 断言配置,将接收类型指定为 String,并将 predicate 参数指定上一步中配置的 alwaysTruePredicate

    yaml
    spring:
      cloud:
        gateway:
          routes:
            - id: some-system
              uri: lb://some-system
              filters:
                - StripPrefix=1
              predicates:
                - Path=/some-system/**
                - name: GwReadBody
                  args:
                    inClass: '#{T(String)}'
                    predicate: '#{@alwaysTruePredicate}'
  4. 重启网关服务后,再次在 XssFilter 中通过 getBody() 获取请求内容时,获得的已经是完整的请求体了。