Subscribing to MongoDB's changeStream with the mongo-kafka component yields a JSON string full of escape characters:
"{\"_id\": {\"_data\": \"8267D0F733000001502B022C0100296E5A1004366730C56F7E41A790BDA4CF23259A4F46645F6964006467B91713A024A00E32CDF6800004\"}, \"operationType\": \"update\", \"clusterTime\": {\"$timestamp\": {\"t\": 1741748019, \"i\": 336}}, \"fullDocument\": {\"_id\": \"67b91713a024a00e32cdf680\", \"platformUID\": \"1863032388537794561\", \"createdAt\": \"2025-02-22T00:15:15.013Z\", \"meta\": {\"handle\": \"Petrada_he\", \"profile_image_url\": \"https://pbs.twimg.com/profile_images/1892687444467662849/PQ6qRoUj_normal.png\", \"description\": \"Cofounder @TheBigWhale_ I Journaliste I Auteur de \\\"Bitcoin Cryptos, l'enjeu du siècle\\\"\", \"followers\": 2, \"following\": 268, \"tweetCount\": 307}, \"monitorFollowingTeam\": [], \"platform\": \"twitter\", \"updatedAt\": \"2025-03-12T02:53:39.129Z\"}, \"ns\": {\"db\": \"bifrost\", \"coll\": \"socialaccounts\"}, \"documentKey\": {\"_id\": \"67b91713a024a00e32cdf680\"}, \"updateDescription\": {\"updatedFields\": {\"meta\": {\"handle\": \"Petrada_he\", \"profile_image_url\": \"https://pbs.twimg.com/profile_images/1892687444467662849/PQ6qRoUj_normal.png\", \"description\": \"Cofounder @TheBigWhale_ I Journaliste I Auteur de \\\"Bitcoin Cryptos, l'enjeu du siècle\\\"\", \"followers\": 2, \"following\": 268, \"tweetCount\": 307}, \"updatedAt\": \"2025-03-12T02:53:39.129Z\"}, \"removedFields\": [], \"truncatedArrays\": []}}"
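Parsing this JSON directly with fastjson throws a ClassCastException, because the payload as a whole is a JSON string literal and the parser returns a String instead of a JSONObject: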
JSONObject jsonObject = JSONObject.parseObject(rawString);
Caused by: java.lang.ClassCastException: java.lang.String cannot be cast to com.alibaba.fastjson.JSONObject
Removing the escape characters with the replace method also destroys the escaped quotes inside values:
JSONObject jsonObject = JSONObject.parseObject(rawString.replace("\\\"", "\""));
com.alibaba.fastjson.JSONException: not close json text, token : error
To remove the escape characters around the keys without touching the escaped quotes inside values, a regular expression is needed:
String output = rawString.replaceAll("(?<!\\\\)\\\\\"", "\"");
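To make the difference between the two approaches concrete, here is a minimal sketch that runs both on a small sample. The class name, method names, and sample record are hypothetical (they are not from the original data), and the outer quotes of the payload are left out for brevity.

```java
public class EscapeStripDemo {
    // Naive approach from the text: turns every \" into ", including those inside values.
    static String naiveStrip(String raw) {
        return raw.replace("\\\"", "\"");
    }

    // Regex from the text: rewrites \" only when no backslash stands directly before it.
    static String regexStrip(String raw) {
        return raw.replaceAll("(?<!\\\\)\\\\\"", "\"");
    }

    public static void main(String[] args) {
        // Hypothetical sample. Runtime content: {\"name\": \"say \\\"hi\\\"\"}
        String raw = "{\\\"name\\\": \\\"say \\\\\\\"hi\\\\\\\"\\\"}";

        // Prints {"name": "say \\"hi\\""}  (the value ends right after the two backslashes)
        System.out.println(naiveStrip(raw));

        // Prints {"name": "say \\\"hi\\\""}  (the quotes inside the value stay escaped)
        System.out.println(regexStrip(raw));
    }
}
```

Note that the regex output still carries the doubled backslash in front of each inner quote, so a JSON parser would read the value as say \"hi\" with literal backslashes rather than say "hi".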
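Another record contains special characters that the regex cannot handle. Its description value ends with a backslash, so the closing \" of that value is itself preceded by a backslash and the lookbehind skips it: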
"{\"_id\": {\"_data\": \"8267D127240000012D2B022C0100296E5A1004366730C56F7E41A790BDA4CF23259A4F46645F6964006466EAB2B5000A4DD13F9615D20004\"}, \"operationType\": \"update\", \"clusterTime\": {\"$timestamp\": {\"t\": 1741760292, \"i\": 301}}, \"fullDocument\": {\"_id\": \"66eab2b5000a4dd13f9615d2\", \"platformUID\": \"1807346982681497600\", \"createdAt\": \"2024-09-18T11:00:05.753Z\", \"meta\": {\"handle\": \"cyberzone5758\", \"profile_image_url\": \"https://pbs.twimg.com/profile_images/1832300244513480706/deeABfbw_normal.jpg\", \"description\": \"²\\u207f\\u1d48 \\u1d43\\u1d9c\\u1d9c\\u1d52\\u1d58\\u207f\\u1d57 ⁻ \\u1d43\\u1d4f\\u1d58\\u207f \\u1d4f\\u1d49\\u1d48\\u1d58\\u1d43\\n\\n/Saya ini orang yang gampang, tapi jangan suka gampangin saya\\\\\", \"followers\": 87, \"following\": 233, \"tweetCount\": 1553}, \"monitorFollowingTeam\": [], \"platform\": \"twitter\", \"updatedAt\": \"2025-03-12T06:18:12.556Z\"}, \"ns\": {\"db\": \"bifrost\", \"coll\": \"socialaccounts\"}, \"documentKey\": {\"_id\": \"66eab2b5000a4dd13f9615d2\"}, \"updateDescription\": {\"updatedFields\": {\"meta\": {\"handle\": \"cyberzone5758\", \"profile_image_url\": \"https://pbs.twimg.com/profile_images/1832300244513480706/deeABfbw_normal.jpg\", \"description\": \"²\\u207f\\u1d48 \\u1d43\\u1d9c\\u1d9c\\u1d52\\u1d58\\u207f\\u1d57 ⁻ \\u1d43\\u1d4f\\u1d58\\u207f \\u1d4f\\u1d49\\u1d48\\u1d58\\u1d43\\n\\n/Saya ini orang yang gampang, tapi jangan suka gampangin saya\\\\\", \"followers\": 87, \"following\": 233, \"tweetCount\": 1553}, \"updatedAt\": \"2025-03-12T06:18:12.556Z\"}, \"removedFields\": [], \"truncatedArrays\": []}}"
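import javax.script.ScriptEngine;
import javax.script.ScriptEngineManager;
// Evaluate the payload as a JavaScript string literal; the result is the unescaped JSON text.
ScriptEngineManager manager = new ScriptEngineManager();
ScriptEngine engine = manager.getEngineByName("JavaScript");
String formattedJson = engine.eval(rawString).toString();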
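Using JavaScript and evaluating the string as a line of code removes the escape characters. Note that getEngineByName("JavaScript") returns null on JDK 15 and later unless a JavaScript engine is added to the classpath, since Nashorn was removed from the JDK, and that eval executes its input as code, so this only suits trusted data.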
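Because the payload as a whole is one JSON string literal (which is why fastjson returns a String), another option is to decode that outer literal once and then parse the result. The sketch below is a minimal, hypothetical helper: `JsonStringLiteralDecoder.decode` is not part of fastjson or mongo-kafka, it uses only the JDK, and it assumes the input is a single well-formed string literal without validating every edge case. Its output could then be passed to `JSONObject.parseObject`.

```java
public class JsonStringLiteralDecoder {
    // Decodes one JSON string literal (surrounding quotes included) into the text it encodes.
    static String decode(String literal) {
        String s = literal.trim();
        if (s.length() < 2 || s.charAt(0) != '"' || s.charAt(s.length() - 1) != '"') {
            throw new IllegalArgumentException("not a JSON string literal");
        }
        int end = s.length() - 1; // index of the closing quote
        StringBuilder out = new StringBuilder(s.length());
        for (int i = 1; i < end; i++) {
            char c = s.charAt(i);
            if (c == '"') {
                throw new IllegalArgumentException("unescaped quote at index " + i);
            }
            if (c != '\\') {
                out.append(c);
                continue;
            }
            if (++i >= end) {
                throw new IllegalArgumentException("dangling backslash");
            }
            char e = s.charAt(i);
            switch (e) {
                case '"':  out.append('"');  break;
                case '\\': out.append('\\'); break;
                case '/':  out.append('/');  break;
                case 'b':  out.append('\b'); break;
                case 'f':  out.append('\f'); break;
                case 'n':  out.append('\n'); break;
                case 'r':  out.append('\r'); break;
                case 't':  out.append('\t'); break;
                case 'u':
                    // Four hex digits follow the letter u.
                    if (i + 4 >= end) {
                        throw new IllegalArgumentException("truncated unicode escape");
                    }
                    out.append((char) Integer.parseInt(s.substring(i + 1, i + 5), 16));
                    i += 4;
                    break;
                default:
                    throw new IllegalArgumentException("unknown escape: " + e);
            }
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical reduced record whose value ends with a backslash,
        // the case that defeats the lookbehind regex.
        // Runtime content: "{\"d\": \"a\\\\\"}"
        String rawString = "\"{\\\"d\\\": \\\"a\\\\\\\\\\\"}\"";
        System.out.println(decode(rawString));
    }
}
```

Since each escape is consumed as a pair from left to right, a value ending in a backslash needs no special handling, and no code from the payload is ever executed.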